REPORT  DOCUMENTATION  PAGE 


Form  Approved  OMB  NO.  0704-0188 


The  public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions, 
searching  existing  data  sources,  gathering  and  maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments 

regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information,  including  suggesstions  for  reducing  this  burden,  to  Washington 

Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington  VA,  22202-4302. 
Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  any  oenalty  for  failing  to  comply  with  a  collection 
of  information  if  it  does  not  display  a  currently  valid  OMB  control  number. 


PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ADDRESS. 


1.  REPORT  DATE  (DD-MM-YYYY) 

29-10-2007 

2.  REPORT  TYPE 

Final  Report 

3.  DATES  COVERED  (From  -  To) 

15-Jun-2006  -  14-Jun-2007 

4.  TITLE  AND  SUBTITLE 

SEPARATED  REPRESENTATIONS  AND  FAST  ALGORITHMS  FOR 
MATERIALS  SCIENCE 

5a.  CONTRACT  NUMBER 

W91  INF-06-1-0254 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

6D10S7 

6.  AUTHORS 

Gregory  Beylkin,  Lucas  Monzon  and  Fernando  Perez 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAMES  AND  ADDRE: 

University  of  Colorado  -  Boulder 
Office  of  Contracts  and  Grants 
Campus  Box  572,  3100  Marine  Street  Rm  481 
Boulder,  CO _ 80309  -0572 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND 
ADDRESS(ES) 

U.S.  Army  Research  Office 
P.O.Box  12211 

Research  Triangle  Park,  NC  27709-2211 

12.  DISTRIBUTION  AVAILIBILITY  STATEMENT 
Approved  for  public  release;  Federal  purpose  rights 

13.  SUPPLEMENTARY  NOTES 
The  views,  opinions  and/or  findings  contained  in  this  report  are  those  of  the  author(s)  and  should  not  contmed  as  an  official  Department 
of  the  Army  position,  policy  or  decision,  unless  so  designated  by  other  documentation. 

14.  ABSTRACT 

Our  goal  (within  the  time-frame  of  the  grant)  was  to  finish  the  development 
of  algorithms  and  software  for  applying  Green's  functions  (and  other 
operators)  and  to  develop  and  test  algorithms  for  computing  multiparticle 
wave  functions,  both  based  based  on  representing  operators  and  functions 
of  many  variables  as  short  sums  of  separable  functions,  the  so-called 
separated  representations.  Our  approach  is  different  from  the  Fast 


15.  SUBJECT  TERMS 

Separated  representations,  Green's  function,  fast  algorithms,  Schrodinger  equation 


16.  SECURITY  CLASSIFICATION  OF: 

17.  LIMITATION  OF 

ABSTRACT 

SAR 

15.  NUMBER 

OF  PAGES 

19a.  NAME  OF  RESPONSIBLE  PERSON 

Gregory  Beylkin 

a.  REPORT 

U 

b.  ABSTRACT 

U 

c.  THIS  PAGE 

U 

19b.  TELEPHONE  NUMBER 

303-492-6935 

Standard  Form  298  (Rev  8/98) 


8.  PERFORMING  ORGANIZATION  REPORT 
NUMBER 


10.  SPONSOR/MONITOR'S  ACRONYM(S) 
ARO 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBER(S) 

50966-PH-DRP.1 


Prescribed  by  ANSI  Std.  Z39. 1 8 


Report  Title 

SEPARATED  REPRESENTATIONS  AND  FAST  ALGORITHMS  FOR  MATERIALS  SCIENCE 

ABSTRACT 

Our  goal  (within  the  time-frame  of  the  grant)  was  to  finish  the  development 
of  algorithms  and  software  for  applying  Green's  functions  (and  other 
operators)  and  to  develop  and  test  algorithms  for  computing  multiparticle 
wave  functions,  both  based  based  on  representing  operators  and  functions 
of  many  variables  as  short  sums  of  separable  functions,  the  so-called 
separated  representations.  Our  approach  is  different  from  the  Fast 
Multipole  Method  and  allows  us  to  develop  and  use  fast  algorithms 
in  high  dimensions. 

We  have  accomplished  these  tasks  with  the  results  described  in  two 
papers  attached  to  this  report.  We  started  the  process  of  comparing 
our  approach  to  solving  multiparticle  Schrodinger  equation  to  that 
currently  used  in  Quantum  Chemistry. 


List  of  papers  submitted  or  published  that  acknowledge  ARO  support  during  this  reporting 
period.  List  the  papers,  including  journal  references,  in  the  following  categories: 

(a)  Papers  published  in  peer-reviewed  journals  (N/A  for  none) 

G.  Beylkin,  V.  Cheruvu,  and  F.  Perez 

Fast  adaptive  algorithms  in  the  non-standard  form  for  multidimensional  problems,  Appl.  Comput.  Harmon.  Anal.  Accepted  for  publication. 
(2007)  APPM  Preprint  550,  available  electronically  from  Appl.  Comput.  Harmon.  Anal,  website 

G.  Beylkin,  M.J.  Mohlenkamp,  and  F.  Perez,  Approximating  a  wavefunction  as  an  unconstrained  sum  of  Slater  determinants,  Submitted  to  J. 
Math.  Phys.,  (2007).  APPM  Preprint  554. 

Number  of  Papers  published  in  peer-reviewed  journals:  2.00 


(b)  Papers  published  in  non-peer-reviewed  journals  or  in  conference  proceedings  (N/A  for  none) 


Number  of  Papers  published  in  non  peer-reviewed  journals:  0.00 


(c)  Presentations 


Gregory  Beylkin,  Martin  J.  Mohlenkamp,  and  Fernando  Perez, 

"Toward  Solving  the  Multiparticle  Schrodinger  Equation 
via  an  Unconstrained  Sum  of  Slater  Determinants", 

Pacific  Northwest  National  Lab,  Richland,  WA,  January  25-26,  2007 

Gregory  Beylkin,  "Fast  algorithms  for  adaptive  application  of  integral  operators  in  high  dimensions",  The  2007  John  H.  Barrett  Memorial 
Lectures,  University  of  Tennessee,  April  28,  2007 

Gregory  Beylkin,  Martin  J.  Mohlenkamp,  and  Fernando  Perez,  "Preliminary  results  on  approximating  a  wavefunction  as  an  unconstrained 
sum  of  Slater  determinants",  ICIAM,  Zurich,  Switzerland,  July  17,  2007 

Number  of  Presentations:  3.00 


Non  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


Number  of  Non  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


0 


Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


Number  of  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


0 


(d)  Manuscripts 


Number  of  Manuscripts:  0.00 


Number  of  Inventions: 


Graduate  Students 

NAME 

PERCENT  SUPPORTED 

FTE  Equivalent: 

Total  Number: 

Names  of  Post  Doctorates 


NAME 

PERCENT  SUPPORTED 

Lucas  Monzon 

0.25 

Fernando  Perez 

0.25 

FTE  Equivalent: 

0.50 

Total  Number: 

2 

Names  of  Faculty  Supported 


NAME 

PERCENT  SUPPORTED 

National  Academy  Member 

Gregory  Beylkin 

0.08 

No 

FTE  Equivalent: 

0.08 

Total  Number: 

1 

Names  of  Under  Graduate  students  supported 


NAME  PERCENT  SUPPORTED 

FTE  Equivalent: 

Total  Number: 


Student  Metrics 

This  section  only  applies  to  graduating  undergraduates  supported  by  this  agreement  in  this  reporting  period 

The  number  of  undergraduates  funded  by  this  agreement  who  graduated  during  this  period: . 

The  number  of  undergraduates  funded  by  this  agreement  who  graduated  during  this  period  with  a  degree  in 

science,  mathematics,  engineering,  or  technology  fields: . 

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  will  continue 

to  pursue  a  graduate  or  Ph.D.  degree  in  science,  mathematics,  engineering,  or  technology  fields: . 

Number  of  graduating  undergraduates  who  achieved  a  3.5  GPA  to  4.0  (4.0  max  scale): . 

Number  of  graduating  undergraduates  funded  by  a  DoD  funded  Center  of  Excellence  grant  for 

Education,  Research  and  Engineering: . 

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  intend  to 

work  for  the  Department  of  Defense . 

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  will  receive 
scholarships  or  fellowships  for  further  studies  in  science,  mathematics,  engineering  or  technology  fields: . 


Names  of  Personnel  receiving  masters  degrees 


NAME 

Total  Number: 

Names  of  personnel  receiving  PHDs 

NAME 

Total  Number: 

Names  of  other  research  staff 

NAME  PERCENT  SUPPORTED 

FTE  Equivalent: 

Total  Number: 

Sub  Contractors  (DD882) 

Inventions  (DD882) 


FINAL  REPORT  ’’SEPARATED  REPRESENTATIONS  AND  FAST 
ALGORITHMS  FOR  MATERIALS  SCIENCE” 

AWARD  W911NF-06- 1-0254 
PERIOD:  JUNE  15,  2006  TO  JUNE  14,  2007 

GREGORY  BEYLKIN  *  (PI),  LUCAS  A.  MONZON,  AND  FERNANDO  PEREZ 


1.  Abstract 

Our  goal  (within  the  time- frame  of  the  grant)  was  to  finish  the  development  of 
algorithms  and  software  for  applying  Green’s  functions  (and  other  operators)  and 
to  develop  and  test  algorithms  for  computing  multiparticle  wave  functions,  both 
based  based  on  representing  operators  and  functions  of  many  variables  as  short 
sums  of  separable  functions,  the  so-called  separated  representations.  Our  approach 
is  different  from  the  Fast  Multipole  Method  and  allows  us  to  develop  and  use  fast 
algorithms  in  high  dimensions. 

We  have  accomplished  these  tasks  with  the  results  described  in  two  papers  at¬ 
tached  to  this  report.  We  started  the  process  of  comparing  our  approach  to  solving 
multiparticle  Schrodinger  equation  to  that  currently  used  in  Quantum  Chemistry. 


2.  Tasks 

Our  work  addresses  two  tasks  in  using  separated  representations  [5,  4]: 

(1)  The  first  class  of  algorithms  addresses  problems  that  currently  require  either 
the  Fast  Multipole  Method  (FMM)  and/or  Particle  in  cell  methods  and/or 
Ewald  summation  techniques.  Our  approach  is  simpler,  more  flexible  and 
is  as  effective  as  FMM.  It  is  based  on  separated  representations  and  its 
main  advantage  is  that  it  works  for  operators  in  higher  dimensions  (much 
greater  than  three)  provided  that  the  functions  are  maintained  in  separated 
representation  as  well. 

(2)  The  second  class  of  algorithms  addresses  multiparticle  computations  and 
confronts  the  problem  of  dimensionality  directly,  by  providing  a  represen¬ 
tation  of  operators  and  functions  that,  on  one  hand,  achieves  a  finite  but 
arbitrary  accuracy  and,  on  the  other,  limits  the  growth  of  the  computational 
cost  to  a  linear  function  of  the  number  of  particles  (at  least  asymptotically). 
With  this  type  of  algorithms,  we  compete  with  the  Configuration  Interac¬ 
tion  (Cl)  and  of  Coupled  Cluster  (CC)  methods.  Our  approach  is  different 
from  both  Cl  and  CC,  and  holds  the  promise  of  a  breakthrough. 


Department  of  Applied  Mathematics,  University  of  Colorado  at  Boulder;  526  UCB,  Boulder, 
CO  80309-0526;  beylkin@colorado.edu;  phone  303-492-6935,  fax  303-492-4066. 
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3.  Results  for  Task  1:  Multiresolution  separated  representations 

AND  FAST  ALGORITHMS 


Our  goal  has  been  to  obtain  multiresolution  representations  of  Green’s  functions 
in  a  form  that  facilitates  solving  integral  equations.  We  note  that  computations 
involving  multidimensional  free  space  Green’s  functions  are  greatly  simplified  by 
using  separated  representations  [5,  4,  18,  19,  29,  30,  13,  3].  Our  approach  is  based 
on  representing  spherically  symmetric  functions  by  sums  of  products  of  Gaussians. 


3.1.  Separated  representations  for  Poisson-type  kernels.  We  construct  a 
separated  approximation  of  the  function  l/rQ,  where  r  =  ||x||,  x  £  R3  using  a 
collection  of  Gaussians.  The  approximation  is  obtained  by  discretizing  the  integral 


(1) 


1 

■j *  Q! 


2 

I\a/2) 


2  2s  i 

e_r  e  +aS  ^  _ 


For  Q:  =  1  it  is  the  same  integral  as  used  in  the  Ewald  summation  (up  to  a  change 
of  variables,  see  e.g.,  [16]).  We  have 


Proposition  3.1.  For  any  a  >  0,  0  <  S  <  1,  and  0  <  e  <  min  {5,  §},  there  exist 
positive  numbers  pm  and  wm  such  that 


(2) 

where 


M 


-E 


m— 1 


e 

<  — . 

/yiQ 


(3)  M  =  log  e  1  [c0  +  ci  log  e  1+c2log(5  *], 

where  c\,  c2  and  C3  are  constants  that  only  depend  on  a.  For  fixed  power  a  and 
accuracy  e,  we  have  M  =  O(log<5_1). 


A  proof  of  Proposition  3.1  can  be  found  in  [7]. 

Using  r  =  ||ar||,  where  x  =  (aq,  x2,  X3),  and  a  =  1  in  (2),  we  arrive  at  a  separated 
representation  for  the  Poisson  kernel.  Although  in  this  paper  we  compute  the 
lattice  sums  corresponding  to  the  Poisson  kernel,  the  same  approach  will  work  for 
any  a  >  0  as  well  as  other  spherically  symmetric  potentials,  e.g.  Yukawa  potential 
e_Atr/r. 

As  in  [18,  19],  the  approximation  in  (2)  is  obtained  using  trapezoidal  rule.  First, 
we  discretize  the  integral  (1),  namely,  set  pm  =  e2sm  and  wm  =  2As  eaSm /T(a/2), 
where  sm  =  so  +  (to  —  l)As,  to  =  1 . . . ,  M.  For  a  given  accuracy  e  and  range 
0  <  <5  <  r  <  1,  we  select  So  and  sm  =  so  +  {M  —  l)As,  the  end  points  of  the 
interval  of  integration  replacing  the  real  line  in  (1),  so  that  at  these  points  the 
function  f(s)  =  e~r  e  s+as  and  a  sufficient  number  of  its  derivatives  are  close  to 
zero  to  within  the  desired  accuracy.  We  also  select  M,  the  number  of  points  in  the 
quadrature,  so  that  the  accuracy  requirement  is  satisfied. 

Based  on  these  representations,  we  have  implemented  a  fast,  adaptive  multires¬ 
olution  algorithm  for  applying  integral  operators  with  a  wide  class  of  radially  sym¬ 
metric  kernels  in  dimensions  one,  two  and  three.  In  particular,  we  consider  operators 
of  the  class  (— A+^i2/)_a,  where  p  >  0  and  0  <  a  <  3/2,  and  illustrate  performance 
of  the  algorithm  for  the  Poisson  and  Schrodinger  equations  in  dimension  three.  The 
same  algorithm  may  be  used  for  all  operators  with  radially  symmetric  kernels  ap¬ 
proximated  as  a  weighted  sum  of  Gaussians,  making  it  applicable  across  multiple 
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fields  by  reusing  a  single  implementation.  This  fast  algorithm  provides  controllable 
accuracy  at  a  reasonable  cost,  comparable  to  that  of  the  Fast  Multipole  Method 
(FMM).  It  differs  from  the  FMM  by  the  type  of  approximation  used  to  represent 
kernels  and  has  the  advantage  of  being  easily  extendable  to  higher  dimensions. 

The  details  of  the  algorithm  are  described  in  the  paper  by  Gregory  Beylkin, 
Vani  Cheruvu  and  Fernando  Perez  “Fast  Adaptive  Algorithms  in  the  Non-Standard 
Form  for  Multidimensional  Problems”  [2]  accepted  for  publication  in  Applied  and 
Computational  Harmonic  Analysis.  The  paper  is  attached  to  this  report. 


4.  Results  for  Task  2:  Multiparticle  Schrodinger  Equation 

This  part  of  the  project  is  the  beginning  of  the  program  to  develop  accurate 
methods  for  solving  equations  of  multiparticle  quantum  mechanics.  We  have  verified 
correctness  of  our  approach  by  computing  electron  structure  of  a  few  elements.  We 
started  the  process  of  comparing  performance  of  our  algorithm  with  other  methods 
in  Quantum  Chemistry. 


4.1.  Approximating  the  Wavefunction  with  an  unconstrained  sum  of  Slater 
Determinants.  The  multiparticle  Schrodinger  equation  is  the  basic  governing 
equation  in  quantum  mechanics.  We  consider  the  time-independent  case,  and  fix 
the  nuclei  according  to  the  Born-Oppenheimer  approximation,  so  the  equation  de¬ 
scribes  the  steady  state  of  an  interacting  system  of  electrons.  For  each  of  the  N 
electrons  in  the  system  there  are  three  spatial  variables  r  =  ( x ,  y,  z ),  and  a  discrete 
spin  variable  a  taking  the  values  {  —  which  we  combine  and  denote  (r,  a) 

by  7.  The  Hamiltonian  operator  Ti  is  a  sum  of  a  kinetic  energy  operator  T,  a  nu¬ 
clear  potential  operator  V,  and  an  electron-electron  interaction  operator  W,  defined 
by 


1  N  N  1  JV  If  1 

(4)  n  =  T  +  V  +  W  =  --  5>i  +  5>(r0  +  --££ 


2  ^-"77 '  Mr*  -r,-| 

*=1  3^1 


where  A,  is  the  three-dimensional  Laplacian  acting  in  the  variable  r,  and  v(r)  is  a 
sum  of  terms  of  the  form  za/ ||r  —  Ra||  from  a  nucleus  at  position  Ra  with  charge 
za.  The  antisymmetric  eigenfunctions  of  Ti  represent  electronic  states  of  the  system 
and  are  called  wavefunctions.  Antisymmetric  means  that  under  the  exchange  of  any 
two  coordinates,  the  wavefunction  is  odd,  e.g.  1^(72, 71,  •  •  •)  =  — ^(71, 72, . . .).  The 
bound-state  wavefunctions  have  negative  eigenvalues,  and  are  of  greatest  interest,  so 
we  will  focus  on  the  wavefunction  with  the  most  negative  eigenvalue.  In  summary, 
our  goal  is  to  find  the  most  negative  (discrete)  eigenvalue 


(5) 


Hip  =  Xip , 


subject  to  the  antisymmetry  condition  on  ip. 

Analytic  methods  can  give  qualitative  results  about  its  solutions,  and  determine 
limiting  cases,  but  most  quantitative  results  must  be  obtained  numerically.  Al¬ 
though  the  equation  is  a  ‘simple’  eigenvalue  problem,  its  numerical  solution  presents 
several  serious  difficulties,  among  them  the  large  number  of  variables  and  the  an¬ 
tisymmetry  condition  on  the  solution.  The  simplest  method  that  addresses  these 
two  difficulties  is  Hartree-Fock  (HF),  which  uses  the  anti-symmetrization  of  a  single 
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product  to  approximate  the  iV-particle  wavefunction,  i.e., 

N 

(6)  fpHF  =  ■ 

i= 1 

Any  approximation  ^  to  the  wavefunction  ^  can  be  substituted  into 

m 

to  obtain  an  upper  bound  on  the  lowest  value  of  A  that  solves  (5).  Substituting  (6) 
into  (7),  one  can  derive  a  system  of  equations  for  (j)t  to  minimize  (7).  The  resulting 
i/’hf  will  best  approximate  ip,  in  the  sense  of  providing  the  best  estimate  (7). 

To  improve  upon  HF,  it  is  natural  to  consider  a  sum  of  products 

r  JV 

(8)  V’(r)  =  ■ 

1=1  i=l 

The  coefficients  si  are  not  strictly  necessary,  but  they  allow  us  to  assume  ||^-||  =  1. 
Many  methods  are  based  on  this  form,  and  the  distinction  is  in  how  they  use  it. 
The  Configuration  Interaction  (Cl)  method  (see  e.g.  [27])  chooses  the  functions  (p\ 
from  a  preselected  master  set  of  orthogonal  functions  and  decides  on  a  large  number 
r  of  combinations  to  consider,  based  on  excitation  level.  Substituting  (8)  into  (7) 
leads  to  a  matrix  eigenvalue  problem  that  can  be  solved  for  the  scalar  coefficients 
Si .  The  Multi-Configuration  Self-Consistent  Field  (MCSCF)  method  (e.g.  [15,  9]), 
chooses  a  pattern  of  excitations  similar  to  Cl,  but  then  solves  for  the  master  set 
of  orthogonal  functions  as  well  as  the  scalar  coefficients.  Many  variations  and 
combinations  of  these  methods  have  been  developed,  and  indeed  there  is  a  whole 
industry  in  producing  them. 

We  demonstrate  a  method  that  also  uses  a  wavefunction  of  the  form  (8)  but 
without  constraints  such  as  orthogonality  on  the  <p\ .  By  removing  these  constraints 
we  produce  much  better  approximations  at  much  smaller  r  than  existing  methods 
allow.  In  another  context  [5]  we  have  given  examples  where  removing  constraints 
produces  expansions  that  are  exponentially  more  efficient,  i.e.  r  =  N  instead  of  2N 
or  r  =  log  N  instead  of  N.  For  example,  in  our  approach  we  can  have  a  two-term 
representation 

(pihi)(p2(l2)  ■  ■  ■  4>n(1n) 

(9)  +c  [</>i(7i)  +  0jv+i(7i)] [02(72)  +  ^+2(72)]  •"  [<Pn(7n)  +  4>2n{1n)\  , 

where  {<Pj}2j=\  form  an  orthonormal  set.  To  represent  (9)  while  requiring  all  factors 
to  come  from  a  master  orthogonal  set  would  force  one  to  multiply  out  the  second 
term  and  thus  obtain  a  representation  with  2N  terms.  It  is  common  sense  that 
removal  of  constraints  could  produce  better  results,  and  steps  in  that  direction 
have  been  taken  (e.g. [26,  1,  14,  11,  12,  31,  25]).  These  works,  however,  were  only 
able  to  partially  remove  the  constraints,  and  so,  we  claim,  did  not  achieve  the  full 
potential. 

We  will  use  a  Green’s  function  iteration  to  move  a  trial  wavefunction  toward  the 
minimum  of  (7)  without  using  (7)  directly.  This  iteration  was  introduced  in  [22,  21] 
and  used  in  e.g.  [19].  Define  the  Green’s  function 

(10)  G„  =  {T-p 1)~\ 
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for  /i  <  0.  The  Green’s  function  iteration  is 

9n  =  -G„n[(V  +  W)fn] 

(11)  Mn+1  =  9n-((V  +  W)fnJn~9u)/\\gn\\2 

fn+ 1  =  Sn/llfl'nll- 

The  Green’s  function  iteration  is  essentially  an  inverse  power  method.  The  conver¬ 
gence  rate  is  only  linear,  but  if  the  initial  g  can  be  chosen  near  to  but  less  than 
the  lowest  eigenvalue,  then  the  error  will  decrease  by  a  substantial  fraction  at  each 
iteration,  and  not  many  iterations  will  be  needed.  We  use  /  to  denote  the  number 
of  Green’s  function  iterations  needed. 

We  use  approximate  wavefunctions  of  the  form  (8),  with  r  fixed.  The  iteration 

(11)  does  not  directly  produce  an  approximation  of  the  same  form,  so  we  modify  it 
by  defining  gn  to  be  the  function  of  the  form  (8)  that  minimizes 

(12)  ll5n-(-^„[(V  +  W)/„])||. 

In  order  to  assure  convergence  to  an  antisymmetric  solution,  we  use  the  pseudo¬ 
norm  induced  by  the  pseudo  inner  product  (-,-)a  =  (*4(-), »4(-)),  as  we  did  in  [5]. 
Constructing  gn  is  the  most  challenging  part  of  the  method,  and  requires  the  bulk 
of  our  effort.  To  simplify  notation,  we  now  suppress  the  iteration  index  n  and  set 
4>  =  fn  and  i>  =  gn. 

We  begin  with  some  approximation  (such  as  if)  itself)  and  will  iteratively 
improve  it.  The  outermost  loop  of  our  iteration  is  simply  to  repeat  our  refinement 
until  it  appears  that  has  converged.  For  the  computational  cost  estimates  we 
denote  the  number  of  repetitions  by  I\ .  To  refine  our  representation  we  loop  through 
the  variables  (electrons) .  The  functions  in  variables  other  than  the  current  variable 
are  fixed,  and  the  functions  in  the  current  variable  will  be  modified  to  minimize  the 
overall  error  \\ip  —  i/’IU-  This  Alternating  Least-Squares  (ALS)  approach  is  well- 
known  (see  e.g.  [20,  23,  24,  8,  10,  28]).  We  will  alternate  through  the  directions, 
but  for  ease  of  exposition  we  describe  the  k  =  1  case.  So,  4>lk  is  fixed  for  k  >  1,  and 
we  will  solve  for  the  values  of  (j)[  for  all  l. 

To  refine  in  the  current  variable,  we  set  up  and  solve  a  linear  least-squares 
problem.  The  normal  equations  for  a  least-squares  problem  are  derived  by  taking  a 
gradient  with  respect  to  the  free  parameters  and  setting  this  equal  to  zero.  As  long 
as  i/>  is  linear  and  not  degenerate  in  these  parameters,  the  resulting  equations  are 
linear  and  have  a  unique  solution.  Usually  these  free  parameters  are  coefficients  of 
the  representation  in  some  basis.  We  instead  take  the  parameters  to  be  the  point 
values  of  our  functions  (p\ .  or,  formally,  as  the  coefficients  of  the  point  evaluation 
functional  (7).  The  formulas  that  we  derive  can  be  used  with  a  fixed  basis,  but  are 
stated  independent  of  the  basis.  We  still  obtain  linear  normal  equations 

(13)  Ax  =  b  , 

but  now  b(l )  is  a  function  of  7,  x(l')  is  a  function  of  7',  and  A{l,V)  is  an  integral 
operator  mapping  functions  of  7'  to  functions  of  7.  The  kernel  of  A  is  defined  by 

IN  N 

(14)  a(u')(  7,y) = /  (7/>n^,’^n^ 

\  i= 2  i—2 
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where  the  point  evaluation  functionals  are  acting  in  the  i  =  1  direction.  The 
functions  in  b  are  defined  by 

r  j  N  N 

(15)  6(0(7)  =  E  (  -Sm[V  +  W]  n  C,  <7)  n  # 

m  \  i—  1  z—2 

Once  A  and  b  have  been  constructed,  we  will  apply  the  Conjugate  Gradient  iterative 
method  (see  e.g.  [17])  to  solve  (13).  We  initialize  with  r  =  b  —  Ax,  v  =  r,  and 
c  =  (r,  r),  and  then  the  core  of  the  method  is  the  sequence  of  assignments  z  <—  Av, 
t  <—  c/(v,z),  x  4—  x  +  fv,  r  4—  r  —  tz,  d  <—  (r,  r),  v  4—  r+(d/c)v,  and  c  <—  d,  applied 
iteratively.  We  use  S  to  denote  the  number  of  conjugate  gradient  iterations  needed. 
Thus  x  is  constructed  using  only  matrix-vector  products  and  vector  additions,  all 
which  are  compatible  with  our  formulation  with  integral  operators.  The  conjugate 
gradient  method  applies  only  to  positive-definite  operators.  Our  operator  A  is  only 
semi-definite  due  to  the  null-space  in  the  antisymmetric  pseudo-norm.  Fortunately, 
b  was  computed  with  the  same  pseudo-norm  and  has  no  component  in  the  null- 
space  of  A. 

One  advantage  of  using  this  iterative  method  with  integral  operators  is  that  our 
algorithm  is  “basis-free”.  The  representation  of  x  can  naturally  be  adaptive  in  7, 
for  example  refining  near  the  nuclei  as  indicated  by  the  refinement  in  b.  For  the 
estimates  of  computational  cost,  we  use  M  to  denote  the  cost  to  represent  a  function 
of  7,  or  integrate  such  a  function.  The  antisymmetry  constraint  requires  N  <  M, 
and  in  general  we  expect  M  to  be  much  larger  than  N.  For  our  numerical  results, 
we  use  adaptive  polynomial  multiwavelets,  following  [18,  19].  In  those  works  it  was 
shown  that  this  basis  effectively  eliminates  basis-set  error  within  HF. 

We  are  left  with  the  problem  of  how  to  construct  A  in  (14)  and  b  in  (15).  We 
have  develop  the  machinery  and  algorithms  for  computing  these  antisymmetric 
inner  products.  Our  formulation  uses  low-rank  perturbations  of  matrices,  thus 
avoiding  co- factor  expansions. 

The  computational  cost  for  the  whole  method  is  acceptable.  As  noted  above,  the 
cost  depends  on  N,  r,  M,  /,  K,  and  S.  Although  S  in  theory  could  be  as  many  as 
rM,  we  have  a  very  good  starting  point,  and  so  expect  only  a  very  small  constant 
number  to  be  needed.  We  use  MlogM  to  denote  the  cost  to  convolve  a  function 
of  7  with  1/ 1| r  1 1 .  Some  Poisson  solvers  achieve  this  complexity,  but  this  cost  may 
vary  with  the  choice  of  basis.  We  use  L  to  denote  the  number  of  terms  used  to 
approximate  the  Green’s  function  with  Gaussians.  The  final  computational  cost  is 
then 

(16)  0(KIr2  N2  [L(N  +  MlogM)  +  S{N  +  M)}). 

For  comparison,  the  cost  to  evaluate  a  single  instance  of  Lowdin’s  rules  is  0(N2(N+ 
M)).  The  size  r  needed  in  practice,  and  how  it  depends  on  the  various  parameters 
in  the  problem,  is  still  an  open  question. 

We  have  verified  our  approach  by  computing  electron  structure  of  a  few  elements 
and  are  in  the  process  of  accelerating  the  algorithm  and  verifying  its  performance 
against  methods  currently  used  in  quantum  chemistry.  Further  details  may  be 
found  in  the  paper  by  Gregory  Beylkin,  Martin  J.  Mohlenkamp  and  Fernando  Perez 
“Approximating  a  Wavefunction  as  an  Unconstrained  Sum  of  Slater  Determinants” 
[6]  submitted  for  publication  to  Journal  of  Mathematical  Physics.  The  paper  is 
attached  to  this  report. 
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Abstract 

We  present  a  fast,  adaptive  multiresolution  algorithm  for  applying  integral  operators 
with  a  wide  class  of  radially  symmetric  kernels  in  dimensions  one,  two  and  three. 
This  algorithm  is  made  efficient  by  the  use  of  separated  representations  of  the  kernel. 
We  discuss  operators  of  the  class  (—A  +  p2I)~a,  where  p  >  0  and  0  <  a  <  3/2, 
and  illustrate  the  algorithm  for  the  Poisson  and  Schrodinger  equations  in  dimension 
three.  The  same  algorithm  may  be  used  for  all  operators  with  radially  symmetric 
kernels  approximated  as  a  weighted  sum  of  Gaussians,  making  it  applicable  across 
multiple  fields  by  reusing  a  single  implementation. 

This  fast  algorithm  provides  controllable  accuracy  at  a  reasonable  cost,  compa¬ 
rable  to  that  of  the  Fast  Multipole  Method  (FMM).  It  differs  from  the  FMM  by 
the  type  of  approximation  used  to  represent  kernels  and  has  an  advantage  of  being 
easily  extendable  to  higher  dimensions. 


Key  words:  Separated  representation:  multiwavelets;  adaptive  algorithms;  integral 
operators. 


1  Introduction 


For  a  number  of  years,  the  Fast  Multipole  Method  (FMM)  [1,2,3]  has  been 
the  method  of  choice  for  applying  integral  operators  to  functions  in  dimen¬ 
sions  d  <  3.  On  the  other  hand,  multiresolution  algorithms  in  wavelet  and 
multiwavelet  bases  introduced  in  [4]  for  the  same  purpose  were  not  efficient 
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enough  to  be  practical  in  more  than  one  dimension.  Recently,  with  the  in¬ 
troduction  of  separated  representations  [5,6,7],  practical  multiresolution  algo¬ 
rithms  in  higher  dimensions  [8,9,10,11]  became  available  as  well.  In  this  paper 
we  present  a  new  fast,  adaptive  algorithm  for  applying  a  class  of  integral  op¬ 
erators  with  radial  kernels  in  dimensions  d  =  1,2,3,  and  we  briefly  discuss  its 
extension  to  higher  dimensions. 

In  physics,  chemistry  and  other  applied  fields,  many  important  problems  may 
be  formulated  using  integral  equations,  typically  involving  Green’s  functions 
as  their  kernels.  Often  such  formulations  are  preferable  to  those  via  partial 
differential  equations  (PDEs).  For  example,  evaluating  the  integral  expressing 
the  solution  of  the  Poisson  equation  in  free  space  (the  convolution  of  the 
Green’s  function  with  the  mass  or  charge  density)  avoids  issues  associated  with 
the  high  condition  number  of  a  PDE  formulation.  Integral  operators  appear 
in  fields  as  diverse  as  electrostatics,  quantum  chemistry,  fluid  dynamics  and 
geodesy;  in  all  such  applications  fast  and  accurate  methods  for  evaluating 
operators  on  functions  are  needed. 


The  FMM  and  our  approach  both  employ  approximate  representations  of 
operators  to  yield  fast  algorithms.  The  main  difference  lies  in  the  type  of 
approximations  that  are  used.  For  example,  for  the  Poisson  kernel  1  jr  in 
dimension  d  =  3,  the  FMM  [3]  uses  a  plane  wave  approximation  starting  from 
the  integral 
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where  r  =  \J(x  —  x0)2  +  (y  —  y0)2  +  (z  —  z0)2.  The  elegant  approximation  de¬ 
rived  from  this  integral  in  [3]  is  valid  within  a  solid  angle,  and  thus  requires 
splitting  the  application  of  an  operator  into  directional  regions;  the  number 
of  such  regions  grows  exponentially  with  dimension.  For  the  same  kernel,  our 
approach  starts  with  the  integral 
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and  its  discretization  with  finite  accuracy  e  yields  a  spherically  symmetric 
approximation  as  a  weighted  sum  of  gaussians.  Other  radial  kernels  can  be 
similarly  treated  by  a  suitable  choice  of  integrals.  The  result  is  a  separated 
representation  of  kernels  and,  therefore,  an  immediate  reduction  in  the  cost 
of  their  application.  This  difference  in  the  choice  of  approximation  dictates 
the  differences  in  the  corresponding  algorithms.  In  dimension  d  <  3  both  ap¬ 
proaches  are  practical  and  yield  comparable  performance.  The  key  advantage 
of  our  approach  is  its  straightforward  extensibility  to  higher  dimensions  [6,7]. 


Given  an  arbitrary  accuracy  e,  we  effectively  represent  kernels  by  a  set  of  ex¬ 
ponents  and  weights  describing  the  terms  of  the  gaussian  approximation  of 
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integrals  like  in  (2).  The  number  of  terms  in  such  sum  is  roughly  proportional 
to  log(e_1),  or  a  low  power  of  log(e^1),  depending  on  the  operator.  Since  op¬ 
erators  are  fully  described  up  to  an  accuracy  e  by  the  exponents  and  weights 
of  the  sum  of  gaussians,  a  single  algorithm  applies  all  such  operators.  These 
include  operators  such  as  (—A  +  p2I)~a,  where  /i  >  0  and  0  <  a  <  3/2, 
and  certain  singular  operators  such  as  the  projector  on  divergence-free  func¬ 
tions.  Since  many  physically  significant  operators  depend  only  on  the  distance 
between  interacting  objects,  our  approach  is  directly  applicable  to  problems 
involving  a  wide  class  of  operators  with  radial  kernels. 

We  combine  separated  and  multiresolution  representations  of  kernels  and  use 
multiwavelet  bases  [12]  that  provide  inter  alia  a  method  for  discretizing  inte¬ 
gral  equations,  as  is  the  case  in  quantum  chemistry  [8,9,11,10].  This  choice  of 
multiresolution  bases  accommodates  integral  and  differential  operators  as  well 
as  a  wide  variety  of  boundary  conditions,  without  degrading  the  order  of  the 
method  [13,14],  Multiwavelet  bases  retain  the  key  desired  properties  of  wavelet 
bases,  such  as  vanishing  moments,  orthogonality,  and  compact  support.  Due  to 
the  vanishing  moments,  wide  classes  of  integro-differential  operators  have  an 
effectively  sparse  matrix  representation,  i.e.,  they  differ  from  a  sparse  matrix 
by  an  operator  with  small  norm.  Some  of  the  basis  functions  of  multiwavelet 
bases  are  discontinuous,  similar  to  those  of  the  Haar  basis  and  in  contrast  to 
wavelets  with  regularity  (see  e.g.  [15,16]).  The  usual  choices  of  scaling  func¬ 
tions  for  multiwavelet  bases  are  either  the  scaled  Legendre  or  interpolating 
polynomials.  Since  these  are  also  used  in  the  discontinuous  Galerkin  and  dis¬ 
continuous  spectral  elements  methods,  our  approach  may  also  be  seen  as  an 
adaptive  extension  of  these  methods. 

The  algorithm  for  applying  an  operator  to  a  function  starts  with  computing 
its  adaptive  representation  in  a  multiwavelet  basis,  resulting  in  a  2d-tree  with 
blocks  of  coefficients  at  the  leaves.  Then  the  algorithm  adaptively  applies  the 
(modified)  separated  non-standard  form  [4]  of  the  operator  to  the  function  by 
using  only  the  necessary  blocks  as  dictated  by  the  function’s  tree  represen¬ 
tation.  We  note  that  in  higher  dimensions,  d  3,  functions  need  to  be  in 
a  separated  representation  as  well,  since  the  usual  constructions  via  bases  or 
grids  are  prohibitive  (see  [6,7]). 

We  start  in  Section  2  by  recalling  the  basic  notions  of  multiresolution  analy¬ 
sis,  non-standard  operator  form  and  adaptive  representation  of  functions  un¬ 
derlying  our  development.  We  then  consider  the  separated  representation  for 
radially  symmetric  kernels  in  Section  3,  and  use  it  to  efficiently  extend  the 
modified  ns-form  to  multiple  spatial  dimensions  in  Section  4.  We  pay  partic¬ 
ular  attention  to  computing  the  band  structure  of  the  operator  based  on  one 
dimensional  information.  We  use  this  construction  in  Section  5  to  introduce 
the  adaptive  algorithm  for  application  of  multidimensional  operators  in  the 
modified  ns-form,  and  illustrate  its  performance  in  Section  6.  We  consider 
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two  examples:  the  Poisson  equation  in  free  space  and  the  ground  state  of  the 
Hydrogen  atom.  We  conclude  with  a  brief  discussion  in  Section  7. 


2  Preliminary  considerations 


This  section  and  Appendix  are  provided  for  the  convenience  of  the  reader  in 
order  to  keep  this  paper  reasonably  self-contained.  We  provide  background 
material  and  introduce  necessary  notation. 

The  essence  of  our  approach  is  to  decompose  the  operator  using  projectors 
on  a  Multiresolution  Analysis  (MRA),  and  to  efficiently  apply  its  projections 
using  a  separated  representation.  We  use  the  decomposition  of  the  operator 
into  the  ns-form  [4],  but  we  organize  it  differently  (thus,  modified  ns-form)  to 
achieve  greater  efficiency.  This  modification  becomes  important  as  we  extend 
this  algorithm  to  higher  dimensions. 

In  this  section  we  introduce  notation  for  MRA,  describe  the  adaptive  repre¬ 
sentation  of  functions  and  associated  data  structures,  introduce  the  modified 
ns-form  and  an  algorithm  for  its  adaptive  application  in  dimension  d  =  1  as 
background  material  for  the  multidimensional  case. 


2.1  Multiresolution  analysis 


Let  us  consider  the  multiresolution  analysis  as  a  decomposition  of  L2([0,l]d) 
into  a  chain  of  subspaces 


Vo  c  Vi  c  v2  c  •  •  •  c  vn  c  . . . , 

so  that  L2([ 0,  l]d)  =  U-qV,,  We  note  that  our  indexing  of  subspaces  (increas¬ 
ing  towards  finer  scales)  follows  that  in  [13],  and  is  the  reverse  of  that  in  [4,15]. 
On  each  subspace  Vj,  we  use  the  tensor  product  basis  of  scaling  functions  ob¬ 
tained  using  the  functions  (jfifix)  (k  =  0, ...  ,p  —  1)  which  we  briefly  describe 
in  Appendix. 

The  wavelet  subspaces  W j  are  defined  as  the  orthogonal  complements  of  V j 
in  Vj_i,  thus 

vn  =  v0  ®;l0  w3-  . 

Introducing  the  orthogonal  projector  on  Vj,  P j  :  L2([0,l]d)  — >  Vj  and  con¬ 
sidering  an  operator  T  :  L2([0,l]d)  — >  L2([0,l]d),  we  define  its  projection 
Tj  :  Vj  — >  Vj  as  Tj  =  PjTPj.  We  also  consider  the  orthogonal  projector 
Q j  :  L2{[ 0,  l]d)  — »  Wj,  defined  as  Q j  =  Pj+i  —  Pj. 
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2.2  Adaptive  representation  of  functions 


Let  us  describe  an  adaptive  refinement  strategy  for  construction  multiresolu¬ 
tion  representations  of  functions  /  :  B  — >  B,  where  B  =  [0, 1] We  proceed 
by  recursive  binary  subdivision  of  the  box  B ,  so  the  basic  structure  represent¬ 
ing  our  functions  is  a  2d— tree  with  arrays  of  coefficients  stored  at  the  leaves 
(terminal  nodes)  and  no  data  stored  on  internal  nodes.  On  each  box  obtained 
via  this  subdivision,  our  basis  is  a  tensor  product  of  orthogonal  polynomials  of 
degree  k  —  0, . . .  ,p  —  1  in  each  variable,  as  described  in  Appendix  8.  Therefore, 
the  leaves  carry  d-dimensional  arrays  of  pd  coefficients  which  may  be  used  to 
approximate  function  values  anywhere  in  the  box  corresponding  to  the  spatial 
region  covered  by  it,  via  (30)  or  its  equivalent  for  higher  values  of  d.  For  con¬ 
ciseness,  we  will  often  refer  to  these  d-dimensional  arrays  of  coefficients  stored 
at  tree  nodes  as  function  blocks. 

This  adaptive  function  decomposition  algorithm  is  similar  to  that  used  in  [17]. 
Such  construction  formally  works  in  any  dimension  d.  However,  since  its  com¬ 
plexity  scales  exponentially  with  d,  its  practical  use  is  restricted  to  fairly  low 
dimension,  e.g.  d  <  4.  In  higher  dimensions,  alternate  representation  strate¬ 
gies  for  functions  such  as  [6,7]  should  be  considered.  In  high  dimensions,  these 
strategies  deal  with  the  exponential  growth  of  complexity  by  using  controlled 
approximations  that  have  linear  cost  in  d. 


For  simplicity,  we  will  describe  the  procedure  for  the  one-dimensional  case 
since  the  extension  to  dimensions  d  =  2,  3  is  straightforward.  Since  we  can 
not  afford  to  construct  our  representation  by  starting  from  a  fine  scale  (es¬ 
pecially  in  d  =  3),  we  proceed  by  successive  refinements  of  an  initial  coarse 
sampling.  This  approach  may  result  in  a  situation  where  the  initial  sampling 
is  insufficient  to  resolve  a  rapid  change  in  a  small  volume;  however,  in  practical 
applications  such  situations  are  rare  and  may  be  avoided  by  an  appropriate 
choice  of  the  initial  sampling  scale. 


Let  Bj  =  [2  H,  2  i(l  +  1)],  l  —  0, . . . ,  2J  —  1,  represent  a  binary  subinterval  on 
scale  j.  We  denote  by  //  =  {//(.x'fc)}^  the  vector  of  values  of  the  function 


/  on  the  Gaussian  nodes  in  Bj.  From  these  values  we  compute  the  coeffi¬ 
cients  {4}Pfc  0  (see  (32)  in  Appendix)  and  interpolate  f(x)  for  any  x  E  Bj 
by  using  (30).  We  then  subdivide  Bj  into  two  child  intervals,  Bfjf1  and  Bjjff , 
and  evaluate  the  function  /  on  the  Gaussian  nodes  in  Bf^1  and  B%jf\.  We 


then  interpolate  /  by  using  the  coefficients  from  their  parent  inter¬ 


val  and  denote  by  f.2l  and  fjf  ]  the  vectors  of  interpolated  values  on  the 
two  subintervals.  Now,  if  for  a  given  tolerance  e  either  /^+1  —  /; 


fj+ 1 
' 21 


>  e  or 


fi 


j+ 1 

2Z+1 


fj+ 1 
J21+1 


>  e,  we  repeat  the  process  recursively  for  both  subintervals, 
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£>^+1  and  otherwise,  we  keep  the  coefficients  js^}^  to  represent  the 

function  on  the  entire  interval  Bj.  This  interval  then  becomes  a  leaf  in  our 
tree. 


At  this  stage  we  use  the  £°°  norm 
the  original  function  /  such  that 


thus  constructing  an  approximation  /  to 
<  e,  which  immediately  implies 


J-f 

that  /  —  /  <  e.  This  estimate  clearly  extends  to  any  dimension.  Once 

the  approximation  with  £°°  norm  is  constructed,  the  corresponding  tree  may 
be  pruned  if  an  application  only  requires  the  approximation  to  be  valid  in 
the  £2  norm.  We  start  this  process  on  the  finest  scale  and  simply  remove  all 
blocks  whose  cumulative  contribution  is  below  e.  Other  norms,  such  as  Hi, 
can  be  accommodated  by  appropriately  weighing  the  error  tolerance  with  a 
scale-dependent  factor  in  the  initial  (coarse  to  fine)  decomposition  process. 


The  complete  decomposition  algorithm  proceeds  by  following  the  above  recipe, 
starting  with  an  initial  coarse  scale  (typically  j  =  0)  and  continuing  recursively 
until  the  stopping  criterion  is  met  for  all  subintervals.  In  practice,  we  choose  a 
stopping  scale  jmax,  beyond  which  the  algorithm  will  not  attempt  to  subdivide 
any  further.  Reaching  jmax  means  that  the  function  has  significant  variations 
which  are  not  accurately  resolved  over  an  interval  of  width  2_:,max  using  a  basis 
of  order  p.  A  pseudo-code  listing  of  this  process  is  presented  as  Algorithm  1. 


Algorithm  1  Adaptive  Function  Decomposition. 


Start  at  a  coarse  scale,  typically  j  =  0. 

Recursively,  for  all  boxes  bJ  on  scale  j,  proceed  as  follows: 

Construct  the  list  C  of  2d  child  boxes  tP+l  on  scale  j  +  1. 

Compute  the  values  of  the  function  f{V)  at  the  pd  Gauss-Legendre  quadra¬ 
ture  nodes  in  V . 
for  all  boxes  V+l  e  C  do 

From  f(V),  interpolate  to  the  Gauss-Legendre  quadrature  nodes  in  b3+l , 
producing  values  /(frJ+1). 

Compute  the  values  of  f{V+l)  at  the  Gauss-Legendre  nodes  of  fd+1,  by 
direct  evaluation. 


if 


f{V+l)  -  f(V+1) 


>  e  then 


Recursively  repeat  the  entire  process  for  all  boxes  6J+ 1  e  C . 

end  if 
end  for 

Geting  here  means  that  the  interpolation  from  the  parent  was  successful 
for  all  child  boxes.  We  store  the  parent’s  coefficients  from  id  in  the  function 
tree. 
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(b)  Pointwise  error. 


Nnod  =  8,  e  =  l.Oe  -  04,  Nbloek,  =  33 


(d)  Redundant  tree 


Figure  1.  The  function  f(x)  =  sin(167rx6)  shown  in  (a),  is  decomposed  with  p  =  8 
and  e  =  ICR4;  (b)  shows  the  pointwise  approximation  error,  (c)  is  the  resulting  adap¬ 
tive  tree,  where  smaller  subdivisions  are  required  in  regions  with  higher  frequency 
content,  (d)  is  the  redundant  tree  associated  with  this  adaptive  decomposition,  where 
all  internal  nodes  have  been  filled  with  data. 

2.2.1  Tree  structures  for  representing  functions 


The  decomposition  Algorithm  1  naturally  leads  to  a  tree  data  structure  to 
represent  functions,  with  the  leaves  of  the  tree  corresponding  to  the  spatial 
intervals  over  which  the  multiwavelet  basis  provides  a  sufficiently  accurate 
local  approximation.  By  using  (30)  or  its  higher- dimensional  extensions,  the 
only  data  needed  to  approximate  f(x)  anywhere  in  B  is  the  array  of  basis 
coefficients  on  these  leaves.  Thus  we  use  a  tree  structure  where  the  leaves 
store  these  coefficients  and  the  internal  nodes  do  not  contain  any  data  (and 
are  effectively  removed  since  we  use  hash  tables  for  storage).  We  will  refer 
to  this  structure  as  an  adaptive  tree.  Each  level  in  the  tree  corresponds  to  a 
scale  in  the  MRA,  with  the  root  node  corresponding  to  the  coarsest  projection 
fo  ^  V0. 


Now,  in  order  to  apply  the  modified  non-standard  form  of  an  operator  to  a 
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function,  we  will  show  in  the  next  section  that  we  also  need  the  basis  coeffi¬ 
cients  corresponding  to  internal  nodes  of  the  tree.  Hence,  from  the  adaptive 
tree  data  structure  we  will  compute  a  similar  tree  but  where  we  do  keep  the 
coefficients  of  scaling  functions  on  all  nodes  (leaves  and  internal). 

The  coefficients  on  the  internal  nodes  are  redundant  since  they  are  computed 
from  the  function  blocks  stored  in  the  leaves.  We  will  thus  refer  to  the  tree 
containing  coefficients  on  all  nodes  as  a  redundant  tree.  It  is  constructed  re¬ 
cursively  starting  from  the  leaves,  by  projecting  the  scaling  coefficients  from 
all  sibling  nodes  onto  their  parent  node,  using  the  decomposition  (41). 

Figure  1  shows  both  the  adaptive  and  the  redundant  trees  for  a  sample  func¬ 
tion.  This  figure  displays  the  coarsest  scales  at  the  bottom  and  progressively 
finer  ones  further  up,  with  filled  boxes  representing  nodes  where  scaling  co¬ 
efficients  are  stored  and  empty  boxes  indicating  nodes  with  no  data  in  them 
(these  do  not  need  to  be  actually  stored  in  the  implementation). 


2.3  Modified  ns- form 


The  non-standard  form  [4]  (see  also  [13]  for  the  version  specialized  for  multi¬ 
wavelets)  of  the  operator  T  is  the  collection  of  components  of  the  telescopic 
expansion 

n— 1 

T n  =  (Tn  —  Tn_i)  +  (Tn_i  —  Tn_2)  +  ■  •  •  +  To  =  To  +  ^  (Aj  +  B;  +  Cj),  (3) 

3=0 

where  A j  =  QjTQj,  B ;  =  QjTPj,  and  Cj  =  PjTQj.  The  main  property 
of  this  expansion  is  that  the  rate  of  decay  of  the  matrix  elements  of  the  op¬ 
erators  A  j,  B  j  and  Cj  away  from  the  diagonal  is  controlled  by  the  number 
of  vanishing  moments  of  the  basis  and,  for  a  finite  but  arbitrary  accuracy  e, 
the  matrix  elements  outside  a  certain  band  can  be  set  to  zero  resulting  in  an 
error  of  the  norm  less  than  e.  Such  behavior  of  the  matrix  elements  becomes 
clear  if  we  observe  that  the  derivatives  of  kernels  of  Calderon- Zygmund  and 
pseudo- differential  operators  decay  faster  than  the  kernel  itself.  If  we  use  the 
Taylor  expansion  of  the  kernel  to  estimate  the  matrix  elements  away  from  the 
diagonal,  then  the  size  of  these  elements  is  controlled  by  a  high  derivative  of 
the  kernel  since  the  vanishing  moments  eliminate  the  lower  order  terms  [4], 
We  note  that  for  periodic  kernels  the  band  is  measured  as  a  periodic  distance 
from  the  diagonal,  resulting  in  hlled-in  ‘corners’  of  a  matrix  representation. 

Let  us  introduce  notation  to  show  how  the  telescopic  expansion  (3)  is  used 
when  applying  an  operator  to  a  function.  If  we  apply  the  projection  of  the 
operator  Tj_x  not  on  its  “natural  scale”  j  —  1,  but  on  the  finer  scale  j ,  we 
denote  its  upsampled  version  as  j  (Tj_i).  In  the  matrix  representation  of 


TJ_1 ,  this  operation  results  in  the  doubling  of  the  matrix  size  in  each  direction. 
This  upsampling  f  (•)  and  downsampling  j  (•)  notation  will  also  be  used  for 
projections  of  functions. 

With  this  notation,  computing  g  =  T /  via  (3)  splits  across  scales, 


90  —  T0/o 

91  =  [Ti-  T  (T0)]/! 

92  =  [T2—  T  (T!)]/2  (4) 


9s  =  [T~  T  (T, _!)]/, 


where  = 

As  in  the  application  of  the  usual  ns-form  in  [4],  to  obtain  gn  after  building 
the  set  {g0,  9i,  ■  ■  ■ ,  9n },  we  have  to  compute 

9n  =  9n+  T  (<?n— 1+  T  (<7n- 2+  T  (fin- 3  +  ■  ■  ■  +  (T  %)  ■  ■  •)))  ■  (5) 

The  order  of  the  parentheses  in  this  expression  is  essential,  as  it  indicates  the 
order  of  the  actual  operations  which  are  performed  starting  on  the  coarsest 
subspace  Vo-  For  example,  if  the  number  of  scales  n  —  4,  then  (5)  yields 
<74  =  (?4+  t  {93+  T  {92+  T  (f)\  +  (T  Po)))))  describing  the  sequence  of  necessary 
operations. 

Unfortunately,  the  sparsity  of  the  non-standard  form  induced  by  the  vanishing 
moments  of  bases  is  not  sufficient  for  fast  practical  algorithms  in  dimensions 
other  than  d  —  1.  For  algorithms  in  higher  dimensions,  we  need  an  additional 
structure  for  the  remaining  non-zero  coefficients  of  the  representation.  We  will 
use  separated  representations  (see  Section  3)  introduced  in  [6,18]  and  first 
applied  in  a  multiresolution  setting  in  [8,9,10,11].  Within  the  retained  bands, 
the  components  of  the  non-standard  form  are  stored  and  applied  in  a  separated 
representation  and,  as  a  result,  the  numerical  application  of  operators  becomes 
efficient  in  higher  dimensions. 


2.4  Modified  ns-form  in  ID 


Let  us  describe  a  one-dimensional  construction  for  operators  on  L2([ 0, 1])  to 
introduce  all  the  features  necessary  for  a  multidimensional  algorithm.  Since 
we  use  banded  versions  of  operators,  we  need  to  introduce  the  necessary  book¬ 
keeping. 
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The  “template”  for  the  band  structure  on  scale  j  comes  from  the  band  on  the 
previous  scale  j  —  1.  For  each  block  on  scale  j  —  1,  the  upsampling  operation 
|  (Tj_i)  creates  four  blocks  (all  combinations  of  even/odd  row  and  column 
indices).  We  insist  on  maintaining  the  strict  correspondence  between  these  four 
blocks  and  those  of  Tr  For  this  reason  the  description  of  the  retained  blocks 
of  T j  involves  the  parity  of  their  row  and  column  indices.  Let  us  denote  the 
blocks  in  the  matrix  representing  T j  by  tr'11  ,  where  1,1'  —  0  . . .  2jl .  Individual 
elements  within  these  blocks  are  indexed  as  t]f/  ,  where  i,i'  =  0, ...  ,p  —  1,  and 
p  is  the  order  of  the  multiwavelet  basis.  For  a  given  width  of  the  band  bj ,  we 
keep  the  operator  blocks  tr'U  with  indices  satisfying 


l  —  bj  +  1  <  l1  <  l  +  bj,  for  even  /, 
l  —  bj  <  l'  <  l  +  bj  —  1,  for  odd  l. 


(6) 


We  denote  the  banded  operators  where  we  keep  only  blocks  satisfying  (6)  as 
T/  and  j  (T;_i)L.  If  we  downsample  the  operator  j  (T?_i)b  back  to  its 
original  scale  j  —  1,  then  (6)  leads  to  the  band  described  by  the  condition 

l  ~  IA'/2J  </'</+  [bj/2\  ,  (7) 

where  [bj/2\  denotes  the  integer  part  of  bj/ 2.  We  denote  the  banded  operator 
on  scale  j  —  1  as  where  we  retain  blocks  satisfying  (7). 

If  we  now  rewrite  (4)  keeping  only  blocks  within  the  bands  on  each  scale,  we 
obtain 


£lo  —  T0/o 

9i  =  [T?1-  T  (T0)61]/!  =  T6l/i-  j  (T0)61/! 

g2  =  [T62-  T  (T!)62]/2  =  T6a/2-  T  (T !)6a/2  (8) 


9j  =  [T?-  T  (T j-^]fj  =  T )fj-  j  (Tj_i)bjfj 


For  any  arbitrary  but  hnite  accuracy,  instead  of  applying  the  full  |Tj  —  | 
( Tj _ ! ) ] ,  we  will  only  apply  its  banded  approximation. 

A  simple  but  important  observation  is  that 

i  (IT  (Tj-i)]/,-)  =  ,  (9) 

which  follows  from  the  fact  that  QjP?  —  p?  Q?  —  0)  since  these  are  orthogonal 
projections.  Thus,  we  observe  that  j  (|  (Tj_!)L)  =  T.-6:’^ ;  so  instead  of  ap¬ 
plying  |  (T j_i)bj  fj  on  scale  j,  we  can  obtain  the  same  result  using  (9),  so  that 
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Figure  2.  Modified  non-standard  form  of  the  convolution  operator  in  (11)  in  a  mul¬ 
tiwavelet  basis,  with  white  representing  0  and  black  representing  large  values.  The 
top  matrix  is  the  projection  of  this  operator  on  Vs,  resulting  in  a  dense  matrix.  The 
lower  half  depicts  the  multiresolution  representation  in  (10)  with  only  the  blocks 
that  are  actually  retained  for  a  given  accuracy.  We  will  call  the  leftmost  matrix  in 
this  series  a  whole  band  matrix  and  all  others  outer  band  matrices.  The  two  empty 
outer  band  matrices  on  scales  j  =  0, 1  are  explained  in  the  main  text. 

T  =|  Therefore,  we  will  compute  only  T/6^2-  /)■_]  on 

scale  j  —  1  and  combine  it  with  computing  ./)_  i  •  Incorporating  this  into 
(5),  we  arrive  at 


9n 


(10) 


Using  this  expression  yields  an  efficient  algorithm  for  applying  an  operator, 
as  on  each  scale  j,  —  t^7+1//2^  is  a  sparse  object,  due  to  the  cancellation 

which  occurs  for  most  of  the  blocks.  In  particular,  T^J  —  T^+1^2'  is  missing 
the  blocks  near  the  diagonal,  and  we  will  refer  to  it  as  an  outer  band  matrix. 
We  will  call  a  whole  band  matrix  as  it  contains  both  the  inner  and  outer 
bands. 
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The  structure  of  these  matrices  is  illustrated  in  Figure  2.  The  two  empty 
scales  j  —  0, 1  arise  due  to  the  complete  cancellation  of  blocks  on  these  scales. 
We  note  that  the  modified  non-standard  form  is  constructed  adaptively  in 
the  number  of  scales  necessary  for  a  given  function.  For  just  two  scales,  this 
construction  will  have  the  scale  j  =  1  non-empty.  Given  an  adaptive  decom¬ 
position  of  a  function  on  n  scales,  we  precompute  the  modified  non-standard 
form  (depicted  in  Figure  2  for  five  scales)  on  all  scales  n,n  —  1, . . . ,  1.  For 
matrices  requiring  2n  •  2n  blocks  on  the  finest  scale  n,  we  need  to  keep  and 
apply  only  0( 2n)  blocks,  as  with  the  original  non-standard  form  in  [4], 


2-4-1  Adaptive  application 

Let  us  show  how  to  use  the  multiscale  representation  in  (10)  to  apply  the 
operator  T  to  a  function  /  with  controlled  accuracy  e.  We  describe  an  adaptive 
application  of  the  operator  to  a  function,  where  we  assume  (as  is  often  the 
case)  that  the  tree  structure  of  the  input  is  sufficient  to  adequately  describe 
the  output  with  accuracy  e.  This  assumption  will  be  removed  later. 

Our  algorithm  uses  the  structure  shown  in  Figure  2  in  an  adaptive  fashion. 
We  copy  the  structure  of  the  redundant  tree  for  the  input  function,  and  use 
that  as  a  template  to  be  filled  for  the  output  g.  For  each  node  of  the  output 
tree  we  determine  whether  it  is  a  leaf  or  an  internal  node:  for  leaves,  we  must 
apply  a  whole  band  matrix  on  the  scale  of  that  node,  such  as  the  leftmost 
matrix  for  j  =  5  in  the  example  shown  in  Figure  2.  For  internal  nodes,  we 
apply  an  outer  band  matrix  (for  that  scale).  We  note  that  our  construction 
of  the  operator  produces  both  whole  and  outer  band  matrices  for  all  scales, 
and  we  simply  choose  the  appropriate  kind  for  each  node  of  output  as  needed. 
Upon  completion  of  this  process,  we  apply  the  projection  (5)  to  construct  the 
final  adaptive  tree  representing  the  output. 

Algorithm  2  returns  an  adaptive  tree  representing  the  function  g.  This  tree 
contains  sufficient  information  to  evaluate  g  at  arbitrary  points  by  interpola¬ 
tion  and  may  be  used  as  an  input  in  further  computations. 

We  note  that  Step  2  in  Algorithm  2  naturally  resolves  the  problem  that  is 
usually  addressed  by  mortar  methods,  see  e.g.  [19,20,21,22],  Since  adaptive 
representations  have  neighboring  blocks  of  different  sizes,  they  encounter  dif¬ 
ficulties  when  applying  non-diagonal  operators,  as  they  require  blocks  which 
do  not  exist  on  that  scale.  Our  approach  simply  constructs  these  as  needed 
and  caches  them  for  reuse,  without  requiring  any  additional  consideration  on 
the  part  of  the  user. 
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Algorithm  2  Adaptive  non-standard  form  operator  application  in  d  =  1, 
g  =  Tf _ 

Initialization:  Construct  the  redundant  tree  for  /  and  copy  it  as  skeleton 
tree  for  g  (see  Section  2.2.1). 
for  all  scales  j  —  0, . . . ,  n  —  1  do 

for  all  function  blocks  gj  in  the  tree  for  g  at  scale  j  do 

fj  Step  1.  Determine  the  list  of  all  contributing  blocks  of  the  modified 
ns-form  Tjv  ( see  Section  2.3): 
if  gj  belongs  to  a  leaf  then 

Read  operator  blocks  Tj),  from  row  /  of  whole  band  matrix  T bf . 

else 

Read  operator  blocks  T j,,  from  row  l  of  outer  band  matrix  T/  — 

rpfe  +  l/2] 

-Li 

end  if 

fj  Step  2.  Find  the  required  blocks  fj,  of  the  input  function  f : 
if  function  block  fj,  exists  in  the  redundant  tree  for  /  then 
Retrieve  it. 
else 

Create  it  by  interpolating  from  a  coarser  scale  and  cache  for  reuse. 

end  if 

fj  Step  3.  Output  function  block  computation: 

Compute  the  resulting  output  function  block  according  to  gj  = 
ffi'  Tjyfj,  where  the  operation  T 3u,fj  indicates  a  regular  matrix- vector 
multiplication. 

end  for 
end  for 

ff  Step  4-  Adaptive  projection: 

Project  resulting  output  function  blocks  gj  on  all  scales  into  a  proper  adap¬ 
tive  tree  by  using  Eq.  (5). 

Discard  from  the  resulting  tree  unnecessary  function  blocks  at  the  requested 
accuracy. 

Return:  the  function  g  represented  by  its  adaptive  tree. 


2-4-2  Numerical  example 

Let  us  briefly  illustrate  the  application  of  the  modified  ns-form  with  an  exam¬ 
ple  of  a  singular  convolution  on  the  unit  circle,  the  operator  with  the  kernel 
K(x)  =  cot(7rx), 


(Cf)(y)  =  p.v.  f  cot(7T (y-x))f(x)dx,  (11) 

Jo 

a  periodic  analogue  of  the  Hilbert  transform.  In  order  to  find  its  representation 
in  multiwavelet  bases,  we  compute 
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r{l!  =  2  j  J  K( 2  j (x  +  l))  &a>(x)  dx  =2  j  J  cot(7r2  j(x  +  /))  <hjj/(x)  dx  , 

(12) 

where  <lv(x),  i,i'  =  0, . . . ,  k  —  1  are  cross-correlation  functions  described 
in  Appendix  8.4  and  /  =  0,  ±1,  ±2, . . .  2J  —  1.  We  compute  rh/  using  the 
convergent  integrals 

rli'  =  2~j  £  4'/  (cot(vr2-J'(x  +  /))  +  (-1)i+i'cot(7r2-J(-x  +  0)) 

k=i' —i  •'O 

where  is  a  polynomial  described  in  Appendix  8.4.  In  our  numerical  ex¬ 
periment,  we  apply  (11)  to  the  periodic  function  on  [0, 1], 

f(x)  =  Yje-^x+k-l/2)\ 

k£7. 


which  yields 

(CfM  --V  e-“<»+t-1/2>2Era[VS(9+i-l/2)]  sign(n)<r”V/“e2™'' 

fcGZ  V  a  ngZ 

(!3) 

where  e  ^  Erh(?/)  =  ^  fy  es~  y  ds.  Expression  (13)  is  obtained  by  Erst  observ¬ 
ing  that  the  Hilbert  transform  of  e~ax 2  is  — e_ay2Er£(v/a|/),  and  then  evaluating 
the  lattice  sum,  noting  that  (see  [23,  formula  4.3.91]) 


cot(7rx) 


11  00 

;V  +  £ 


2x 


7 r  x 


k=  1 


Xc 


k 2' 


Table  1  summarizes  the  numerical  construction  of  this  solution  for  a  =  300, 
at  various  requested  precisions.  Optimal  performance  is  obtained  by  adjusting 
the  order  of  the  basis  p  as  a  function  of  the  requested  precision,  to  ensure  that 
the  operator  remains  a  banded  matrix  with  small  band,  and  that  the  adaptive 
representation  of  the  input  function  requires  a  moderate  number  of  scales. 
The  resulting  numerical  error  (as  compared  to  the  exact  analytical  solution), 
measured  in  the  i2  norm,  is  shown  in  the  last  column.  Figure  3  shows  the 
input  and  results  for  this  example,  as  well  as  the  point-wise  error  for  the  case 
where  ereq  =  10~12  and  p  —  14  (the  last  row  in  the  table). 


3  Separated  representations  of  integral  kernels 


The  approach  we’ve  discussed  so  far  does  not  efficiently  generalize  to  the 
application  of  non-separable  multidimensional  integral  kernels.  Since  several 


dx, 


14 


p 

Scales 

N blocks 

e 

E2 

5 

[2,3,4] 

8 

10~3 

1.5  •  10”4 

8 

[2,4,5] 

12 

1(T6 

1.3  •  10”7 

11 

[2,4,5] 

14 

10-9 

1.1  •  1CT10 

14 

[3,4,5] 

16 

10-12 

4.4-  10”13 

Table  1 

Results  from  evaluating  (13)  with  our  algorithm.  The  order  of  the  basis  p  is  adjusted 
as  a  function  of  the  requested  precision  e.  The  second  column  indicates  scales  present 
in  the  adaptive  tree  for  the  input.  The  third  column  shows  the  total  number  of 
blocks  of  coefficients  in  this  tree.  The  last  column  (£( 2)  shows  the  actual  error  of  the 
computed  solution  in  the  f2  norm. 


Figure  3.  Results  of  applying  the  cotangent  kernel  to  a  periodized  Gaussian  using 
basis  of  order  p  =  14  (the  last  row  in  Table  1).  The  pointwise  error  is  shown  on  the 
right  for  a  requested  accuracy  of  e  =  10~12. 

physically  important  kernels  belong  to  this  category  (e.g.  the  Poisson  kernels 
in  d  —  2  and  d= 3),  additional  tools  are  needed.  We  now  describe  the  key  idea 
that  allows  us  to  perform  this  generalization  to  d  >  1. 

We  use  the  separated  representation  of  operators  introduced  in  [6,24]  to  re¬ 
duce  the  computational  cost  of  the  straightforward  generalization  of  the  mul¬ 
tiresolution  approach  in  [4],  Such  representations  are  particularly  simple  for 
convolution  operators  and  are  based  on  approximating  kernels  by  a  sum  of 
Gaussians  [24,8,9,11,10].  This  approximation  has  a  multiresolution  character 
by  itself  and  requires  a  remarkably  small  number  of  terms.  In  fact,  our  algo¬ 
rithm  uses  the  coefficients  and  the  exponents  of  the  Gaussian  terms  as  the  only 
input  from  which  it  selects  the  necessary  terms,  scale  by  scale,  according  the 
desired  accuracy  threshold  e.  Therefore,  our  algorithm  works  for  all  operators 
with  kernels  that  admit  approximation  by  a  sum  of  Gaussians.  Examples  of 
such  operators  include  the  Poisson  and  the  bound  state  Helmholtz  Green’s 
functions,  the  projector  on  divergence  free  functions,  as  well  as  regular  and 
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fractional  derivative  operators.  Let  us  consider  a  particular  family  of  operators 
(—A  +  /i2/)-",  where  /i  >  0  and  0  <  a  <  3/2.  The  kernel  of  this  operator 

=  2-l+° .  C„  . 

where  K3_a  is  the  modihed  Bessel  function,  r  =  ||x  —  y||  and  Ca  —  2  • 
2_2a7T_^/r(a),  has  an  integral  representation 

A/t,Q(r)  =  C7a  /°°  e-r2e2se-^2e"2s+(3-2Q)sds.  (14) 

Using  the  trapezoidal  rule,  we  construct  an  approximation  valid  over  a  range 
of  values  5  <  r  <  R  with  accuracy  e,  of  the  form 


K, 


fi,a 


M 

V)  -  £  wme~Trnr2 

m— 1 


A  e^o,a 


r  =  e 


T(3/2  -  a)  ■  Ca 

2r3-2a 


(15) 


where  rm  =  e2Sm,  =  hCae~tj2e~2srn ^+^2o^Srn ,  h  =  (B  —  A)/M  and  sm  = 

A  +  mh.  The  limits  of  integration,  A,  B  and  the  step  size  h  are  selected  as 
indicated  in  [24],  where  it  is  shown  that  for  a  fixed  accuracy  e  the  number  of 
terms  M  in  (15)  is  proportional  to  log(Ah~1).  Although  it  is  possible  to  select 
6  and  R  following  the  estimates  in  [25]  and  optimize  the  number  of  terms  for 
a  desired  accuracy  e,  in  this  paper  we  start  with  an  approximation  that  has  an 
obviously  excessive  range  of  validity  and  thus,  an  excessive  number  of  terms. 

An  example  of  such  approximation  is  shown  in  Figure  4.  For  a  requested  toler¬ 
ance  of  e  =  10~10,  roughly  300  terms  are  enough  to  provide  a  valid  approxima¬ 
tion  over  a  range  of  15  decades.  We  then  let  the  algorithm  choose  the  necessary 
terms,  scale  by  scale,  to  satisfy  the  user-supplied  accuracy  requirement  e.  This 
approach  may  end  up  with  a  few  extra  terms  on  some  scales  in  comparison 
with  that  using  a  nearly  optimal  number  of  terms  [8,10,11,9].  Whereas  the 
cost  of  applying  a  few  extra  terms  is  negligible,  we  gain  significantly  in  having 
a  much  more  flexible  and  general  algorithm. 

We  note  that  approximation  in  (15)  clearly  reduces  the  problem  of  applying 
the  operator  to  that  of  applying  a  sequence  of  Gauss  transforms  [26,27],  one 
by  one.  From  this  point  of  view,  the  algorithm  that  we  present  may  be  con¬ 
sidered  as  a  procedure  for  applying  a  linear  combination  of  Gauss  transforms 
simultaneously. 

In  order  to  represent  the  kernel  K  of  the  operator  in  multiwavelet  bases,  we 
need  to  compute  the  integrals, 
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Figure  4.  Relative  error  of  the  Gaussian  approximation  for  the  Poisson  kernel  in  3 
dimensions.  This  unoptimized  expansion  uses  299  terms  to  cover  a  dynamic  range 
of  roughly  15  decades  with  e  =  10”10  relative  accuracy. 


*l*li*2*2’*3*3 


M 

=  E 

m= 1 


2  33  Jb  wme  Tml12  ,(x+f)ll2  ^ (x1)  ^i2i‘2(x2)  dx, 


(16) 

where  <&a/(x)  are  the  cross-correlations  of  the  scaling  functions  (see  Appendix). 
We  obtain 


where 


jj',1  _  ...  Tp3',m,h  T?j;m,l 2  T?j',m,h 

*lh’®2®2’®3*3  ^  m  *1*1  *2*2  *3*3 

ra=l 

rS”-'  =  ^  J\-r^+,>7'vi'ii,(x)dx. 


(U) 

(18) 


Since  the  Gaussian  kernel  is  not  homogeneous,  we  have  to  compute  integrals 
(18)  for  each  scale.  Although  in  principle  l  G  Z,  in  the  next  section  we  explain 
how  to  restrict  it  to  a  limited  range  on  each  scale,  for  a  given  accuracy  e. 


4  Modified  ns-form  of  a  multidimensional  operator 


In  this  section,  we  describe  how  the  separated  representation  approximations 
of  Section  3  can  be  used  to  construct  a  multidimensional  extension  of  the  ns- 
form  representation  from  Section  2.3,  using  only  one-dimensional  quantities 
and  norm  estimates.  This  makes  our  approach  viable  for  d  >  1.  We  use  the 
modified  ns-form  as  in  the  one-dimensional  case  described  in  Section  2.3.  We 
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find  the  ns-form  essential  for  adaptive  algorithms  in  more  than  one  dimension, 
since: 

(1)  Scales  do  not  interact  as  the  operator  is  applied.  All  interactions  between 
scales  are  accounted  for  by  the  (inexpensive)  projection  at  the  final  step 
of  the  algorithm. 

(2)  For  the  same  reason,  the  subdivision  of  space  at  different  scales  natu¬ 
rally  maps  into  the  supporting  data  structures.  We  note  that  one  of  the 
main  difficulties  in  developing  adaptive  algorithms  is  in  organizing  com¬ 
putations  with  blocks  of  an  adaptive  decomposition  of  a  function  from 
different  scales  but  with  a  common  boundary.  The  methods  for  such  com¬ 
putations  are  known  as  mortar  elements  methods.  In  our  approach  this 
issue  does  not  present  any  obstacle,  as  all  relevant  interactions  are  natu¬ 
rally  accounted  for  by  the  data  structures. 

The  key  feature  that  makes  our  approach  efficient  in  dimensions  d  >  2  is 
the  separated  structure  of  the  modified  ns-form.  Namely,  the  blocks  of  T j  —  j 
(T^i)  are  of  the  form  (for  d  =  3) 


M 


Tj—  T  (T <_1)=  Y  wmFj’mhFj’ml2Fj’ml3 

m= 1 

-  |  (  Y  w mFj~1'Ml Fj~v'mh Fj~v,'ml3\ 

\m=l  / 

M 

—  ^  w  pj;mh  jpj;rnl2 pj\ml3 

Y  wm  T  ( Fj-1;mh )•  t  (FJ'-1;mZ2)-  t  (FJ-1;m/3) 


m=  1 


m—  1 


(19) 


As  in  the  case  d—  1,  the  norm  of  the  operator  blocks  of  T|—  \  (T^_x)  decays 
rapidly  with  ||£||,  l  =  (/i,  /2, 13),  and  the  rate  of  decay  depends  on  the  number 
of  vanishing  moments  of  the  basis  [4],  Moreover,  we  limit  the  range  of  shift 
indices  ||f?||  using  only  one-dimensional  estimates  of  the  differences 


F: 


j;  m,2/ 


t  Fj:,m'1  and  F: 

1  it  1 


,j-,  m,2l+l 


T  Fr) 

1  7.7. 


(20) 


of  operator  blocks  computed  via  (18).  The  norms  of  individual  blocks  FhmJ 
are  illustrated  in  Figure  5  (a),  for  scale  j  =  1. 


By  selecting  the  number  of  vanishing  moments  for  a  given  accuracy,  it  is  suffi¬ 
cient  to  use  Halloo  <  2  in  practical  applications  that  we  have  encountered.  Also, 
not  all  terms  in  the  Gaussian  expansion  of  an  operator  need  to  be  included 
since,  depending  on  the  scale  j,  their  contribution  may  be  negligible  for  a 
given  accuracy,  as  shown  in  Figure  5  (b).  Below  we  detail  how  we  select  terms 
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of  the  Gaussian  expansion  on  a  given  scale  as  well  as  the  significant  blocks  of 
T^—  |  (T'J_1).  This  procedure  establishes  the  band  structure  of  the  operator. 

We  then  project  the  banded  operator  j  (T^_i)  back  to  the  scale  j  —  1  and 
then  combine  blocks  on  the  natural  scale  for  each  projection  in  order  to  apply 
the  operator  efficiently  as  was  explained  in  Section  2.3  for  the  one-dimensional 
case. 

We  note  that  in  deciding  which  terms  to  keep  in  (19),  we  do  not  compute  the 
difference  between  the  full  three  dimensional  blocks  as  it  would  carry  a  high 
computational  cost;  instead  we  use  estimates  based  on  the  one  dimensional 
blocks  of  the  separated  representation.  We  note  that  since  the  resulting  band 
structure  depends  only  on  the  operator  and  the  desired  accuracy  of  its  ap¬ 
proximation,  one  of  the  options  is  to  store  such  band  information  as  it  is  likely 
to  be  reused. 

In  order  to  efficiently  identify  the  significant  blocks  in  T)  —  f  (T^_1)  as  a 
function  of  £,  we  develop  norm  estimates  based  only  on  the  one-dimensional 
blocks.  The  difference  between  two  terms  of  the  separated  representation,  say 
F3F2F3  —  GiG2G3,  may  be  written  as 

F1F2F3  -  GiG2G3  =  (F,  -  G3)F2F3  +  G3{F2  -  G2)F3  +  GiG2(F3  -  G3). 

We  average  six  different  combinations  of  the  three  terms  to  include  all  direc¬ 
tions  in  a  symmetric  manner,  which  results  in  the  norm  estimate 


\\FiF2F3  —  GiG2G3||  <  -sym  [\\Fi  —  Gi||  ||F2||  ||F3||  +  (21) 

||G1||||F2-G2||||F3||  +  ||G1||||G2||||F3-G3||], 

where  the  symmetrization  is  over  the  three  directions  and  generates  18  terms. 
For  the  rotationally  symmetric  operators  with  Gaussian  expansion  as  in  (15) 
computing  the  right  hand  side  in  this  estimate  involves  just  three  types  of  one 
dimensional  blocks  and  their  norms, 


*dif 

Npm'1  = 

ATj-,m;l  _ 

JVT  F  — 


|  (FJ_1;m;h 


(22) 


where  index  j  indicates  the  scale,  m  the  term  in  the  Gaussian  expansion  (15), 
and  l  the  position  of  the  block  in  a  given  direction. 

These  estimates  allow  us  to  discard  blocks  whose  norm  falls  below  a  given 
threshold  of  accuracy,  namely,  for  each  multi-index  £,  we  estimate 
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Separation  index 

(a)  Norms  of  each  one-dimensional 
block  computed  via  (18)  for  scale  j  =  1, 
as  a  function  of  the  index  m,  in  the  sep¬ 
arated  representation. 


Separation  index 


(b)  Norm  estimates  (23)  for  scale  j  =  1 
as  a  function  of  the  index  m  in  the  sep¬ 
arated  representation.  Based  on  these 
estimates,  only  terms  above  the  cutoff 
are  actually  applied. 


Figure  5.  Comparison  of  norms  of  matrix  blocks  generated  by  the  Gaussian  ap¬ 
proximation  for  the  Poisson  kernel  in  dimension  d  =  3.  In  each  picture,  the  curves 
correspond  to  the  different  offsets  l  for  which  blocks  are  generated.  Figure  (b)  illus¬ 
trates  the  estimate  in  (23)  (see  main  text  for  details). 


For  each  scale  j  and  each  block  T^  —  j  (T^_x)  labeled  by  the  multi-index 
i  —  (Zi ,  Z2 ,  h),  we  compute  all  terms  of  the  sum  in  (23)  and  identify  the  range 
[m1,m2]  which  we  need  to  keep  for  that  block,  by  discarding  from  the  sum 
terms  whose  cumulative  contribution  is  below  e.  If  the  entire  sum  falls  below  e, 
this  range  may  be  empty  and  the  entire  j  (T^_1)  is  discarded.  The  range 
[mi,  m2]  differs  significantly  depending  whether  or  not  the  block  is  affected 
by  the  singularity  of  the  kernel  as  is  illustrated  in  Figure  5.  In  Figure  5  (a) 
the  rate  of  decay  for  the  blocks  with  shift  |Z|  =  2,  3  is  significantly  faster  than 
for  the  blocks  with  |Z|  <  1  affected  by  the  singularity.  We  note  that  all  blocks 
of  the  first  150  terms  in  the  separated  representation  (17)  have  norm  1  (and 
rank  1  as  matrices)  and  are  not  shown  in  Figure  5  (a). 

Since  the  difference  in  (23)  involves  blocks  upsampled  from  a  coarser  scale,  all 
shifts  |Z|  <  3  are  affected  by  the  singularity.  Figure  5  (b)  shows  the  r.h.s.  of 
the  estimate  in  (23)  for  different  shifts  along  one  of  the  directions,  where  the 
blocks  along  the  other  two  directions  are  estimated  by  the  maximum  norm 
over  all  possible  shifts. 

After  discarding  blocks  with  norms  less  that  e  using  the  estimate  in  (23),  we 
downsample  the  remaining  blocks  of  f  (T^_:)  back  to  the  original  scale.  This 
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leaves  only  blocks  of  Tj  on  the  scale  j  and  we  remove  additional  blocks  of 
for  the  shifts  |/|  =2,3  where  the  decay  is  sufficiently  fast  to  make  their 
contribution  less  than  e. 

This  leads  us  to  arrange  the  blocks  on  each  scale  into  several  subsets  by  the 
effect  the  singularity  of  the  kernel  has  on  them  and  fold  the  appropriate  range 
[mi,  m2]  for  each  subset.  There  are  three  such  sets  in  dimension  d  =  2  and 
four  sets  in  dimension  d  =  3.  For  each  index  /,  we  will  say  that  the  index 
belongs  to  the  core  if  l  =  —1,0,1  and  to  the  boundary  otherwise.  The  core 
indices  correspond  to  one-dimensional  blocks  whose  defining  integrals  include 
the  singularity  of  the  kernel.  We  then  divide  all  possible  values  of  the  multi¬ 
index  i  =  ( Z 1 ,  Z2 ,  Z3) ,  according  to  the  number  of  core  indices  it  has.  In  d  —  3 
this  gives  us  four  sets: 

•  Core:  all  indices  (Zi,Z2,Z3)  belong  to  the  core. 

•  Boundary- 1:  two  of  the  indices  belong  to  the  core  and  one  to  the  boundary 

•  Boundary-2:  one  of  the  indices  belongs  to  the  core,  the  other  two  to  the 
boundary 

•  Boundary-3:  all  indices  (h,l2,h)  belong  to  the  boundary. 

We  then  find  the  range  [mi, m2]  for  each  subset  and  apply  blocks  of  each 
subset  separately,  thus  avoiding  unnecessary  computations  with  blocks  whose 
contribution  is  negligible.  This  range  analysis  only  needs  to  be  done  once  per 
operator  and  the  desired  accuracy,  and  the  results  may  be  saved  for  repeated 
use. 


5  Multidimensional  adaptive  application  of  ns-form 


In  this  section  we  present  an  algorithm  for  applying  the  modified  non-standard 
form  which  is  an  extension  of  (2)  (based  on  (8)  and  (10))  to  higher  dimensions. 
We  are  now  seeking  to  compute 


#(x)  =  P7Kx)  =  j  K(y  -  x)/(y)dy> 

where  x,  y  G  Md  for  d  =  2,3.  The  separated  approximation  (19)  reduces  the 
complexity  of  applying  the  operator  by  allowing  partial  factorization  of  the 
nested  loops  in  each  scale  indicated  by  the  order  of  summation  and  illustrated 
for  d  =  3, 
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As  described  in  the  previous  section,  this  evaluation  is  done  by  regions  of  in¬ 
dices.  These  regions  of  indices  are  organized  so  that  (for  a  given  accuracy)  the 
number  of  retained  terms  of  the  separated  representation  is  roughly  the  same 
for  all  blocks  within  each  region.  Thus,  we  perform  the  summation  over  the 
terms  of  the  separated  representation  last,  applying  only  the  terms  that  actu¬ 
ally  contribute  to  the  result  above  the  requested  accuracy  threshold,  according 
to  estimate  (23).  Therefore,  we  avoid  introducing  checks  per  individual  block 
and  the  resulting  loss  of  performance. 

Just  as  in  the  one- dimensional  case,  we  use  (19)  in  a  ‘natural  scale’  manner. 
That  is,  blocks  belonging  to  scale  j  are  only  applied  on  that  scale.  As  in  one¬ 
dimensional  case,  the  interaction  between  scales  is  achieved  by  the  projection 
(10)  that  redistributes  blocks  accumulated  in  this  manner  properly  between 
the  scales  to  obtain  the  adaptive  tree  for  the  resulting  function.  The  overall 
approach  is  the  same  as  described  in  (2).  We  note  that,  as  expected,  the  sep¬ 
arated  representation  requires  more  detailed  bookkeeping  when  constructing 
the  data  structures  for  the  operator. 

Remark  1  Our  multiresolution  decomposition  corresponds  to  the  geometri¬ 
cally  varying  refinement  in  finite  element  methods.  In  this  case  the  adjoining 
boxes  do  not  necessarily  share  common  vertices,  forming  what  corresponds  to 
the  so-called  non-conforming  grid.  In  finite  element  methods  such  situation 
requires  additional  construction  provided  by  the  mortar  element  methods. 
Mortar  element  methods  were  introduced  by  Patera  and  his  associates,  see 
e.g.  [19,20,21,22],  These  methods  permit  coupling  discretizations  of  different 
types  in  non-overlapping  domains.  Such  methods  are  fairly  complicated  and 
involve,  for  example,  the  introduction  of  interface  conditions  through  an  L 2 
minimization.  In  our  approach  we  do  not  face  these  issues  at  all  and  do  not 
have  to  introduce  any  additional  interface  conditions.  The  proper  construc¬ 
tion  for  adjoining  boxes  is  taken  care  by  the  redundant  tree  data  structure 
and  Step  2  of  Algorithm  3  for  applying  the  kernel,  which  generates  the  neces¬ 
sary  missing  boxes  on  appropriate  scales. 

Remark  2  Although  Algorithm  3  applies  convolution  operators,  only  minor 
changes  are  needed  to  use  it  for  non-convolutions.  Of  course  in  such  case,  the 
separated  representation  for  the  modified  ns-form  should  be  constructed  by  a 
different  approach. 
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Algorithm  3  Adaptive  non-standard  form  operator  application  in  multiple 
dimensions  (illustrated  for  d  =  2),  g  =  T / 

Initialization:  Construct  the  redundant  tree  for  /  and  copy  it  as  skeleton 
tree  for  g  (see  Section  2.2.1). 
for  all  scales  j  —  0, . . . ,  n  —  1  do 

for  all  function  blocks  gj  t  in  the  tree  for  g  at  scale  j  do 

fj  Step  1.  Determine  the  list  of  all  Core,  Boundary-1  and  Boundary-2 
contributing  operator  blocks  of  the  modified  ns-form  F^,m'll~1^,  2_h 

(see  Section  f): 
if  <7/  j  belongs  to  a  leaf  then 

Read  operator  blocks  F^’m’11  ~li  ^  F^’m’l2~1^  for  Core,  Boundary-1  and 
Boundary-2,  and  their  weights  wm  and  corresponding  ranges  from 
the  separated  representation, 
else 

Read  operator  blocks  F^'m'-'ll~h,  F^m’l2~1^  for  Boundary-1  and 
Boundary-2,  and  their  weights  wm  and  corresponding  ranges  from 
the  separated  representation. 

end  if 

jf  Step  2.  Find  the  required  blocks  f(  of  the  input  function  f: 
if  function  block  f(,t,  exists  in  the  redundant  tree  for  /  then 
Retrieve  it. 
else 

Create  it  by  interpolating  from  a  coarser  scale  and  cache  for  reuse. 

end  if 

fj  Step  3.  Output  function  block  computation: 

For  each  set  S  of  indices  determined  in  Step  1  (Core,  Boundary- 1, 
Boundary-2)  and  the  corresponding  ranges  of  terms  in  the  separated 
representation,  compute  the  sum 


9hh 


=  £ 


VJ  r 


£ 


Add  all  computed  sums  to  obtain  gj  i  ■ 

end  for 
end  for 

Step  4 ■  Adaptive  projection: 

Project  resulting  output  function  blocks  gf  t  on  all  scales  into  a  proper 
adaptive  tree  by  using  Eq.  (5). 

Discard  from  the  resulting  tree  unnecessary  function  blocks  at  the  requested 
accuracy. 

Return:  the  function  g  represented  by  its  adaptive  tree. 
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5.1  Operation  count  estimates 


Adaptive  decomposition  of  functions 


The  cost  of  adaptively  decomposing  a  function  in  d  dimensions  is  essentially 
that  of  an  adaptive  wavelet  transform.  Specifically,  it  takes  O ( A’biocks  ■  pd+1) 
operations  to  compute  such  representation,  where  A^iocks  is  the  final  number 
of  significant  blocks  in  the  representation  and  p  is  the  order  of  multiwavelet 
basis  chosen.  In  comparison  with  the  usual  wavelet  transform,  it  appears  to 
be  significantly  more  expensive.  However,  these  0(pd+l )  operations  process 
0(pd )  points,  thus  in  counting  significant  coefficients  as  it  is  done  in  the  usual 
adaptive  wavelet  transform,  we  end  up  with  0(p)  operations  per  point. 


Operator  application 


The  cost  of  applying  an  operator  in  the  modified  ns-form  is  0(NuOcksMpd+1), 
where  IVbiocks  is  the  number  of  blocks  in  the  adaptive  representation  of  the 
input  function,  M  is  the  separation  rank  of  the  kernel  in  (15)  and  p  is  the 
order  of  the  multiwavelet  basis.  For  a  given  desired  accuracy  e,  we  typically 
select  p  oc  loge-1;  M  has  been  shown  to  be  proportional  to  (loge^1)1',  where  v 
depends  on  the  operator  [24].  In  our  numerical  experiments,  M  is  essentially 
proportional  to  loge-1,  since  we  never  use  the  full  separated  representation, 
as  discussed  in  Section  4  and  illustrated  in  Figure  5. 


This  operation  count  can  be  potentially  reduced  to  0(yN\)\oc\iSMpd)  by  using 
the  structure  of  the  matrices  in  18,  and  we  plan  to  address  this  in  the  future. 


Final  projection 


After  the  operator  has  been  applied  to  a  function  in  a  scale-independent  fash¬ 
ion,  a  final  projection  step  is  required  as  discussed  in  Section  2.3.  This  step 
requires  C,(Arbiocks  • Pd )  operations,  the  same  as  in  the  original  function  decom¬ 
position.  In  practice,  this  time  is  negligible  compared  to  the  actual  operator 
application. 
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6  Numerical  examples 


6.1  The  Poisson  equation 

We  illustrate  the  performance  of  the  algorithm  by  solving  the  Poisson  equation 
in  d  =  3 


V2<Mr)  =  — p(  r) 


(24) 


with  free  space  boundary  conditions,  0(r)  — >  0  and  d<p/dr  — >  0  as  r  — >  oo.  We 
write  the  solution  as 


and  adaptively  evaluate  this  integral.  We  note  that  our  method  can  equally 
be  used  for  d  —  2,  since  the  corresponding  Green’s  function  can  also  be  rep¬ 
resented  as  a  sum  of  Gaussians,  and  the  operator  application  algorithm  has 
been  implemented  in  for  both  d  =  2  and  d  =  3. 

For  our  test  we  select 


i= 1 


so  that  we  solve  the  Poisson  equation  with 


Our  parameters  are  chosen  as  follows:  a  =  300,  rq  =  (0.5, 0.5, 0.5),  r2  = 
(0.6,  0.6,  0.5)  and  r3  =  (0.35,0.6,0.5).  These  ensure  that  p{r)  is  well  below 
our  requested  thresholds  on  the  boundary  of  the  computational  domain.  All 
numerical  experiments  were  performed  on  a  Pentium-4  running  at  2.8  GHz, 
with  2  GB  of  RAM.  The  results  are  summarized  in  Table  2. 

In  order  to  gauge  the  speed  of  algorithm  in  reasonably  computer-independent 
terms,  we  use  a  similar  approach  to  that  of  [17]  and  also  provide  timings  of 
the  Fast  Fourier  Transform  (FFT).  Specifically,  we  display  timings  for  two 
FFTs  as  an  estimate  of  the  time  needed  to  solve  the  Poisson  equation  with 
a  smooth  right  hand  side  and  periodic  boundary  conditions  in  a  cube.  As  in 
[17],  we  compute  the  rate  that  estimates  the  number  of  processed  points  per 
second.  We  observe  that  for  our  adaptive  algorithm  such  rate  varies  between 
3.4  •  104  and  1.1  •  105  (see  Table  2),  whereas  for  the  FFTs  it  is  around  106  (see 
Table  3).  We  note  that  our  algorithm  is  not  fully  optimized,  namely,  we  do  not 
use  the  structure  of  the  matrices  in  (18)  and  the  symmetries  afforded  by  the 
radial  kernels.  We  expect  a  substantial  impact  on  the  speed  by  introducing 
these  improvements  and  will  report  them  separately. 
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Requested  e  =  10  3 


p 

e2 

Eoo 

Time  (s) 

Rate  (pts/s) 

6 

CO 

1 

o 

T— 1 

o 

to 

7.9  •  10”1 

1.2 

7.2  •  104 

8 

1.7-  10“3 

1.2  •  10_1 

0.51 

7.3  •  104 

10 

4.4  •  10“4 

3.7-  10"2 

0.68 

1.1  •  105 

Requested  e  =  10  6 


P 

e2 

Eoo 

Time  (s) 

Rate  (pts/s) 

10 

4.7-  10“6 

3.6  ■  10-4 

10.3 

5.7-  104 

12 

8.5  •  10“6 

4.3  ■  10"5 

13.5 

7.5  •  104 

14 

6.9  •  10-8 

5.2  ■  10-6 

20.0 

8.0  •  104 

Requested  e  =  10  9 


P 

e2 

Eoo 

Time  (s) 

Rate  (pts/s) 

16 

2.5  •  10~10 

2.2  •  10-8 

68.1 

3.5  •  104 

18 

7.7  •  10~n 

3.5  •  10“9 

100.3 

3.4-  104 

20 

1.2  •  10“10 

1.8  •  10“8 

133.4 

3.5  •  104 

Table  2 

Accuracy  and  timings  for  the  adaptive  solution  of  the  Poisson  equation  in  (24). 


size 

Time  (s) 

Rate  (pts/s) 

323 

0.02 

1.7-  106 

643 

0.22 

1.2  •  106 

1283 

2.21 

9.5  •  105 

2563 

20.7 

8.1  •  105 

Table  3 

Timings  of  two  3D  FFTs  to  estimate  the  speed  of  a  non-adaptive,  periodic  Poisson 
solver  on  a  cube  for  smooth  functions. 


We  note  that  the  multigrid  method  (see  e.g.  [28,29])  is  frequently  used  as  a 
tool  for  solving  the  Poisson  equation  (and  similar  problems)  in  differential 
form.  The  FFT-based  gauge  suggested  in  [17]  is  useful  for  comparisons  with 
these  algorithms  as  well. 
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Figure  6.  A  two-dimensional  slice  of  the  three-dimensional  subdivision  of  space  by 
the  scaling  functions  and  an  illustration  of  the  source  term  for  the  Poisson  equation 
(24). 

6.2  The  ground  state  of  the  hydrogen  atom 


A  simple  example  of  computing  the  ground  state  of  the  hydrogen  atom  illus¬ 
trates  the  numerical  performance  of  the  algorithms  developed  in  this  paper, 
and  their  utility  for  constructing  more  complex  codes.  The  eigenfunctions 
0  for  the  hydrogen  atom  satisfy  the  time-independent  Schrodinger  equation 
(written  in  atomic  units  and  spherical  coordinates), 

-^A0--0  =  £0,  (25) 

2  r 

where  E  is  the  energy  eigenvalue.  For  the  ground  state,  E  =  —1/2  and  the 
(unnormalized)  wave  function  is  0  =  e~r .  Following  [30],  we  write 

0  =  -2G0F0,  (26) 

where  G0  =  (—A  +  ii2 3Z)~l  is  the  Green’s  function  for  some  /i  and  V  =  —  1/r 
is  the  nuclear  potential.  For  /i  =  \J—2 E  the  solution  0  of  (26)  has  ||0||2  =  1 
and  coincides  with  that  of  (25).  We  solve  (26)  by  a  simple  iteration  starting 
from  some  value  /io  and  changing  /i  to  obtain  the  solution  with  ||0||2  =  1-  The 
algorithm  proceeds  as  follows: 

(1)  Initialize  with  some  value  /io  and  function  0.  The  number  of  iterations  of 
the  algorithm  is  only  weakly  sensitive  to  these  choices. 

(2)  Compute  the  product  of  the  potential  V  and  the  function  0. 

(3)  Apply  the  Green’s  function  G0  to  the  product  V 0  via  the  algorithm  of 


Figure  7.  Convergence  of  the  iteration  to  obtain  the  ground  state  of  the  hydrogen 
atom  computed  via  formulation  in  (26)  for  the  non-relativistic  Schrodinger  equation. 
The  requested  accuracy  in  applying  the  Green’s  function  is  set  to  10~6. 

this  paper  to  compute 

4*new  ^GfjVcf). 

(4)  Compute  the  energy  for  0new), 


Er.Pir 


new  i  v</>  new)  T  ( V” 0new 5  0new ) 
{ ,4*newi  4*new) 


(5)  Set  Id  =  y/-2 Enew,  4>  =  4>new / W&new ||  and  return  to  Step  2. 


The  iteration  is  terminated  as  the  change  in  /i  and  ||</>||  —  1  falls  below  the 
desired  accuracy.  The  progress  of  the  iteration  is  illustrated  in  Figure  7.  The 
computations  in  Steps  2  and  4  use  the  three  dimensional  extension  of  the 
approach  described  in  [13],  to  compute  point-wise  multiplications  of  adaptively 
represented  functions  and  weak  differential  operators  of  the  same. 


This  example  illustrates  an  application  of  our  algorithm  to  problems  in  quan¬ 
tum  chemistry.  Multiresolution  quantum  chemistry  developed  in  [8,9,10,11] 
also  uses  separated  representations.  The  main  technical  difference  with  [8,9,10,11] 
is  that  we  use  the  modified  non-standard  form  and  apply  operator  blocks  on 
their  “natural”  scale,  thus  producing  a  fully  adaptive  algorithm.  We  are  cur¬ 
rently  using  this  algorithm  as  part  of  a  new  method  for  solving  the  multipar¬ 
ticle  Schrodinger  equation  and  will  report  the  results  elsewhere. 


7  Discussion  and  conclusions 


We  have  shown  that  a  combination  of  separated  and  multiresolution  repre¬ 
sentations  of  operators  yields  a  new  multidimensional  algorithm  for  applying 
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a  class  of  integral  operators  with  radial  kernels.  We  note  that  the  same  algo¬ 
rithm  is  used  for  all  such  operators  as  they  are  approximated  by  a  weighted 
sum  of  Gaussians.  This  fact  makes  our  approach  applicable  across  multiple 
fields,  where  a  single  implementation  of  the  core  algorithm  can  be  reused  for 
different  specific  problems.  The  algorithm  is  fully  adaptive  and  avoids  issues 
usually  addressed  by  mortar  methods. 

The  method  of  approximation  underlying  our  approach  is  distinct  from  that  of 
the  FMM,  has  similar  efficiency  and  has  the  advantage  of  being  more  readily 
extendable  to  higher  dimensions.  We  also  note  that  semi-analytic  approxima¬ 
tions  via  weighted  sums  of  Gaussians  provide  additional  advantages  in  some 
applications.  Although  we  described  the  application  of  kernels  in  free  space, 
there  is  a  simple  extension  to  problems  with  radial  kernels  subject  to  periodic, 
Dirichlet  or  Neumann  boundary  conditions  on  a  cube  that  we  will  describe 
separately. 

The  algorithm  may  be  extended  to  classes  of  non-convolution  operators,  e.g., 
the  C  alder  on- Zygmund  operators.  For  such  extensions  the  separated  represen¬ 
tation  may  not  be  available  in  analytic  form,  as  it  is  for  the  operators  of  this 
paper,  and  may  require  a  numerical  construction.  The  separated  representa¬ 
tion  of  the  kernel  permits  further  generalization  of  our  approach  to  dimensions 
d  3  for  applying  operators  to  functions  in  separated  representation. 

A  notable  remaining  challenge  is  an  efficient,  high  order  extension  of  this  tech¬ 
nique  to  the  application  of  operators  on  domains  with  complicated  geometries 
and  surfaces. 


8  Appendix 

8. 1  Scaling  functions 


We  use  either  the  Legendre  polynomials  P0l . . . ,  Pp-\  or  the  interpolating  poly¬ 
nomials  on  the  Gauss-Legendre  nodes  in  [—1,1]  to  construct  an  orthonormal 
basis  for  each  subspace  Vj  [12,13]. 

Let  us  briefly  describe  some  properties  of  the  Legendre  scaling  functions  </>*,, 
k  —  0, . . . ,  p  —  1,  defined  as 


1/N  v^fe  +  lPfc(2x-l),a;e[0,l] 

Mx)  =  \ 

o,  xi  [o,  i] 


(27) 


and  forming  a  basis  for  Vq.  The  subspace  Vj  is  spanned  by  2 ip  functions 
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obtained  from  0O,  •  •  • ,  4>P-i  by  dilation  and  translation, 

4i(x)  =  2>/2M2jx  ~ l ),  k  =  0,...,p-l,  l  =  0, . . . ,  2?  -  1.  (28) 

These  functions  have  support  on  [2~H,2~^(l  +  1)]  and  satisfy  the  orthonor- 
mality  condition 

poo 

/  =  Skk'Sw  •  (29) 

A  function  /,  defined  on  [0,1],  is  represented  in  the  subspace  Vj  by  its  nor¬ 
malized  Legendre  expansion 


fix) 


2J  —  1  p-1 

554i0fci(aO, 


1=0  k= 0 


(30) 


where  the  coefficients  sh  are  computed  via 

r2-*(l+l) 

s{i=  f{x)^kl{x)dx.  (31) 

J  2~ol 

As  long  as  f(x)  is  smooth  enough  and  is  well  approximated  on  [2~H,  2-J  (/  +  l)] 
by  a  polynomial  of  order  up  to  2p —  1,  we  may  use  Gauss-Legendre  quadratures 
to  calculate  the  s°kl  via 

p-i 

sii  =  2_J/2  55  /(2“J(xi  +  l))</>k(xi)wi,  (32) 

i= 0 

where  x0, . . . ,  xp_i  are  the  roots  of  Pp(2x  —  1)  and  w0) . . . ,  u>p_i  are  the  corre¬ 
sponding  quadrature  weights. 

In  more  than  one  dimension,  the  above  formulas  are  extended  by  using  a  tensor 
product  basis  in  each  subspace.  For  example,  in  two  dimensions  equation  (30) 
becomes 

f(x,x')  =  55  55  55  55  (33) 

1=0  k= 0 1'= 0  k’= 0 


8.2  Multiwavelets 


We  use  piecewise  polynomial  functions  xpo,  •  ■  ■ ,  f>p- 1  as  an  orthonormal  basis 
for  W0  [12,13], 

/  'i/ji(x)ijjj(x)dx  =  Sij.  (34) 

•J  0 

Since  W0  _L  V0,  the  first  p  moments  of  all  ^o,  •  •  • ,  VV-i  vanish: 


'4>i{x)xldx  =  0,  i,  j  —  0, 1, . . .  ,p  —  1. 


(35) 
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The  space  W3-  is  spanned  by  2 functions  obtained  from  ^0,  ■  •  • ,  i>p-i  by 
dilation  and  translation, 

^h(x)  =  2j/2'0fc(2-?x-/),  k  =  0,...,p-l,  1  =  0,...,2J-1,  (36) 

and  supported  in  the  interval  Iji  =  [2~H,  2~i(l  +  1)].  A  function  f(x)  defined 
on  [0, 1]  is  represented  in  the  multiwavelet  basis  on  n  scales  by 


p—  1  n-l  2J  -  lp— 1 

f(x)  =  E  sl,oMx)  +  EEE  dii^ki(x) 

k= 0  j= 0  i=0  fc=0 


(37) 


with  the  coefficients  dkl  computed  via 


dh  =  /  f(x)^h(x)dx- 

J2~ol 


(38) 


8.3  Two-scale  relations 


The  relation  between  subspaces,  V0©  W0  =  V1;  is  expressed  via  the  two-scale 
difference  equations, 


p— i 

Mx)  =  {hkk>M2x)  +  2x  ~  !))  >  k  =  0,...,p-l,  (39) 

k'= o 
p- 1 

^k(x)  =  ^2^2  +  9ki<t>w{?x  -  1))  ,  k  =  0, . . .  ,p  -  1,  (40) 

fc'= o 

where  the  coefficients  /t^  and  r/^ ,  r/j-j  •*  depend  on  the  type  of  polyno¬ 
mial  basis  used  (Legendre  or  interpolating)  and  its  order  p.  The  matrices  of 
coefficients 

tf<0)  =  {'*2b.  ff(I)  =  {*>$}.  G(0>  =  {g£!},  G<‘>  =  {g$} 

are  the  multiwavelet  analogues  of  the  quadrature  mirror  filters  in  the  usual 
wavelet  construction,  e.g.,  [15].  These  matrices  satisfy  a  number  of  important 
orthogonality  relations  and  we  refer  to  [13|  for  complete  details,  including  the 
construction  of  the  H,  G  matrices  themselves.  Let  us  only  state  how  these 
matrices  are  used  to  connect  the  scaling  skl  and  wavelet  dJkl  coefficients  on 
neighboring  scales  j  and  j  +  1.  The  decomposition  procedure  (j  +  1  — >  j)  is 
based  on 
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(41) 


p- 1 


<J  _ 
bkl  ~ 


V^V+Ul,11)  si+1  ) 

/  ,  ylkk'bk,2l  +  'bcfc'^MZ+l J  ) 


fc'=0 

P-1 


rjj  -  V  |V0)  Si+1  4-  0{1)  Si+1  )  ■ 
afcZ  —  \flkk'bk,2l  +  9kk'bk,2l+l)  > 


(42) 


fc'=0 


the  reconstruction  (j  — >  j  +  1)  is  based  on 


p— i 

V  |V0V 
/  ,  yLkklbkl 

k'= 0 


S 


J+l 

fc,2i+l 


p— 1 

/  ,  \'Lkklbkl 
k'= 0 


(43) 

(44) 


8-4  Cross- correlation  of  the  scaling  functions 


For  convolution  operators  we  only  need  to  compute  integrals  with  the  cross¬ 
correlation  functions  of  the  scaling  functions, 

/OO 

4>i(x  +  y)4>i'(y)dy.  (45) 

-oo 

Since  the  support  of  the  scaling  functions  is  restricted  to  [0, 1],  the  functions 
<3h.j/  are  zero  outside  the  interval  [—1,1]  and  are  polynomials  on  [—1,0]  and 
[0, 1]  of  degree  i  +  i'  +  1, 


®u'(x) 


where  i,i'  =  0, . . .  ,p  —  1  and 


(x),  0  <  x  <  1, 

<  <l\i>{x)-,  -1  <  x  <  0, 

0,  1  <  |x|, 


(46) 


*«>(*)  =  /  <k(x  +  v)  <fo(v)dy ,  Oe(i)  =  f  'j>i(x  +  y)'j>l,{y)dy.  (47) 
Jo  J-X 

We  summarize  relevant  properties  of  the  cross-correlation  functions  &a>  in 

Proposition  3 

(1)  Transposition  of  indices:  <fv(x)  =  (— 1  Y+l'  Qi'fx) , 

(2)  Relations  between  <F+and  <h~:  $“■,(— x)  =  (— l)*+*,<Fh,(x)  for  0  <  x  <  1, 
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(3)  Values  at  zero:  <3v(0)  =  0  and  $^(0)  =  1  for  i  —  0, . . .  ,p  —  1, 

(4)  Upper  bound:  max^g^i^]  |$,j/(x)|  <  1  fori,i'  —  0,  ...,p  —  1, 

(5)  Connection  with  the  Gegenbauer  polynomials: 

$oo(x)  =  |  C[~1/2\2x  -  1)  +  |  and  =  \  a/2/  +  1  C'^11/2)(2x  -  1); 

for  l  =  1,2, ,  where  C[+l^2^  is  the  Gegenbauer  polynomial, 


(6)  Linear  expansion:  if  i!  >i  then  we  have 


$.v(z) 


cii'®Ux)i 


(48) 


where 


4i((  +  1)  Jo1  t*,(x)t+  (i)(l  -  (2x  -  l)2)-1^, 


i'  >  i, 


4((f  +  1)  -  $o+oM)  tJWtl  -  (2x 

for  l  >  1  and  c %  =  5^ . 


I)2)  1dx,  i'  =  i, 
(49) 


(7)  Vanishing  moments:  we  have  f\$00(x)  dx  =  1  and  f\  xk$a'(x)  dx  =  0 
for  i  +  i'  >  1  and  0<k<i  +  i'—  1. 

Proof  of  these  properties  can  be  found  in  [25]. 


8.5  Separated  representations  of  radial  functions 


As  an  example,  consider  approximating  the  function  1  jra  by  a  collection  of 
Gaussians.  The  number  of  terms  needed  for  this  purpose  is  mercifully  small. 
We  have  [24] 

Proposition  4  For  any  a  >  0,  0  <  5  <  1,  and  0  <  e  <  min  j|,  ^j;  there 
exist  positive  numbers  rm  and  wm  such  that 


M 

X  wme~Tmr2 


m=  1 


<  r 


A,  for  all  5  <  r  <  1 


(50) 


with 

M  =  log  e"1  [c0  +  C!  log  e_1  +  c2  log  h'1] ,  (51) 

where  C&  are  constants  that  only  depend  on  a.  For  fixed  power  a  and  accuracy 
e,  we  have  M  =  (9(log<5-1). 
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The  proof  of  Proposition  4  in  [24]  is  based  on  using  the  trapezoidal  rule  to  dis¬ 
cretize  an  integral  representation  of  1  jra.  Similar  estimates  may  be  obtained 
for  more  general  radial  kernels  using  their  integral  representations  as  in  (14). 

We  note  that  approximations  of  the  function  1/r  via  sums  of  Gaussians  have 
been  also  considered  in  [31,32,33]. 


8.6  Estimates 


By  selecting  appropriate  e  and  6  in  the  separated  representations  of  a  radial 
kernel  K  (as  in  Proposition  4),  we  obtain  a  separated  approximation  for  the 
coefficients 

=  2~3j  [  13  A'(2“j(x  +  £))  i)  (x2)  $ kk'(x3 )  dx .  (52) 

Since  the  number  of  terms,  M,  depends  logarithmically  on  e  and  S ,  we  achieve 
any  finite  accuracy  with  a  very  reasonable  number  of  terms.  For  example,  for 
the  Poisson  kernel  K(r)  =  l/47rr,  we  have  the  following  estimate  [25], 

Proposition  5  For  any  e  >  0  and  0  <  S  <  1  the  coefficients  in  (52) 

have  an  approximation  with  a  low  separation  rank, 


7  ii'  ,jjf  ,kk' 


M 


E 

m= 1 


W. 


pm, 1 1 

m  r  a’ 


pm,l2  pm,l3 
”  jj'  ”  kh' 


i 


such  that  if  maxi  \  f\  >  2,  then 


\fj-A 

v'iV  ,jj'  ,kk' 


j-J 

^ii'  ,jj'  ,kk' 


<  c02-^e, 


and  z/maxj  |/j|  <  1,  then 


1 1 


j-J 

iV  ,jj'  ,kk' 


rrA 

ii',jj',kk' 


<  2  2j(ci52  +  c0e) 


(53) 


(54) 


(55) 


where  e,  S,  M,  rm;  wm,  m  =  1, . . . ,  M  are  described  in  Proposition  f  for  a  =  1 
and  Co  and  C\  are  (small)  constants. 

As  described  in  Section  4,  our  adaptive  algorithm  selects  only  some  of  the 
terms,  as  needed  on  a  given  scale  for  the  desired  accuracy  e. 
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8.7  Evaluation  of  integrals  with  the  cross- correlation  functions 


We  need  to  compute  integrals  in  (18),  where  the  cross-correlation  functions 
are  given  in  (45).  We  note  that  using  (48),  it  is  sufficient  to  compute 

FitT’1  =  ~  j\-^x+l^/4%i0(x)dx 


for  0  <  i  <  2p  —  1  rather  than  p 2  integrals  in  (18).  Using  the  relation  between 
<f>_  and  $+  in  Proposition  3,  we  have 


^-0(x)e~T^x+l)2  dx 


e~T^x~l)2dx, 


so  that 


X  iO 


^e-Tm{x+l)2 


+  (— l)*e  Tm(x  l)2]$f0(x)dx. 
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Abstract 

The  wavefunction  for  the  multiparticle  Schrodinger  equation  is  a  function  of  many 
variables  and  satisfies  an  antisymmetry  condition,  so  it  is  natural  to  approximate  it  as 
a  sum  of  Slater  determinants.  Many  current  methods  do  so,  but  they  impose  additional 
structural  constraints  on  the  determinants,  such  as  orthogonality  between  orbitals  or 
an  excitation  pattern.  We  present  a  method  without  any  such  constraints,  by  which 
we  hope  to  obtain  much  more  efficient  expansions,  and  insight  into  the  inherent  struc¬ 
ture  of  the  wavefunction.  We  use  an  integral  formulation  of  the  problem,  a  Green’s 
function  iteration,  and  a  fitting  procedure  based  on  the  computational  paradigm  of 
separated  representations.  The  core  procedure  is  the  construction  and  solution  of  a 
matrix-integral  system  derived  from  antisymmetric  inner  products  involving  the  po¬ 
tential  operators.  We  show  how  to  construct  and  solve  this  system  with  computational 
complexity  competitive  with  current  methods. 
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an  Unconstrained  Sum  of  Slater  Determinants 
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I  Introduction 

Given  the  difficulties  of  solving  the  multiparticle  Schrodinger  equation,  current  numerical 
methods  in  quantum  chemistry/physics  are  remarkably  successful.  Part  of  their  success 
comes  from  efficiencies  gained  by  imposing  structural  constraints  on  the  wavefunction  to 
match  physical  intuition.  However,  such  methods  scale  poorly  to  high  accuracy,  and  are 
biased  to  only  reveal  structures  that  were  part  of  their  own  construction.  Our  goal  is  to 
develop  a  method  that  scales  well  to  high  accuracy  and  allows  an  unbiased  exploration  of  the 
structure  of  the  wavefunction.  In  this  paper  we  take  a  step  toward  this  goal  by  developing  a 
method  to  approximate  the  wavefunction  as  an  unconstrained  sum  of  Slater  determinants. 

Since  the  multiparticle  fermionic  wavefunction  is  an  antisymmetric  function  of  many 
variables,  it  is  natural  to  approximate  it  as  a  sum  of  Slater  determinants,  at  least  as  a  first 
step.  Motivated  by  the  physical  intuition  that  electrons  may  be  excited  into  higher  energy 
states,  the  Configuration  Interaction  (Cl)  family  of  methods  choose  a  set  of  determinants 
with  predetermined  orbitals,  and  then  optimize  the  coefficients  used  to  combine  them.  When 
it  is  found  insufficient,  methods  to  optimize  the  orbitals,  work  with  multiple  reference  states, 
etc.,  are  introduced  (along  with  an  alphabet  of  acronyms).  A  common  feature  of  all  these 
methods  is  that  they  impose  some  structural  constraints  on  the  Slater  determinants,  such 
as  orthogonality  of  orbitals  or  an  excitation  pattern.  As  the  requested  accuracy  increases, 
these  structural  constraints  trigger  an  explosion  in  the  number  of  determinants  used,  making 
the  computation  intractable  for  high  accuracy. 

The  a  priori  structural  constraints  present  in  Cl-like  methods  also  force  the  wavefunction 
to  comply  with  such  structure,  whether  or  not  it  really  is  the  case.  For  example,  if  you  use  a 
method  that  approximates  the  wavefunction  as  a  linear  combination  of  a  reference  state  and 
excited  states,  you  could  not  learn  that  the  wavefunction  is  better  approximated  as  a  linear 
combination  of  several  non-orthogonal,  near-reference  states.  Thus  the  choice  of  numerical 
method  is  not  just  a  computational  issue;  it  can  help  or  hinder  our  understanding  of  the 
wavefunction. 

For  these  reasons,  our  goal  is  to  construct  an  adaptive  numerical  method  without  im¬ 
posing  a  priori  structural  constraints  besides  that  of  antisymmetry.  In  this  paper  we  derive 
and  present  an  algorithm  for  approximating  a  wavefunction  with  an  unconstrained  sum  of 
Slater  determinants,  with  fully-adaptive  single-electron  functions.  In  particular  we  discard 
the  notions  of  reference  state  and  excitation  of  orbitals.  The  functions  comprising  the  Slater 
determinants  need  not  come  from  a  particular  basis  set,  be  orthogonal,  or  follow  some  ex¬ 
citation  pattern.  They  are  computed  so  as  to  optimize  the  overall  representation.  In  this 
respect  we  follow  the  philosophy  of  separated  representations  [4,  5] ,  which  allow  surprisingly 
accurate  expansions  with  remarkably  few  terms. 

Our  construction  generates  a  solution  using  an  iterative  procedure  based  on  nonlinear 
approximations  via  separated  representations.  To  accomplish  this  nonlinear  approximation, 
we  derive  a  system  of  integral  equations  that  describe  the  fully-correlated  many-particle 
problem.  The  computational  core  of  the  method  is  the  repeated  construction  and  solution 
of  a  matrix-integral  system  of  equations. 

Specifically,  our  approach  has  the  following  distinctive  features: 

•  We  use  an  adaptive  representation  for  single-electron  functions,  but  our  method  does 
not  depend  on  its  details. 

•  We  use  an  integral  formulation  of  the  multiparticle  Schrodinger  equation  and  a  Green’s 
function  iteration  to  converge  to  the  ground-state  wavefunction.  The  Green’s  function 
is  decomposed  and  applied  using  separated  approximations  obtained  by  expanding  the 
kernel  into  Gaussians. 
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•  We  use  a  variant  of  the  so-called  alternating  least  squares  algorithm  to  reduce  the 
error  of  our  approximation  using  a  sum  of  a  given  number  of  Slater  determinants. 

•  We  compute  antisymmetric  inner  products  involving  portions  of  the  Hamiltonian  oper¬ 
ator  by  reducing  them  to  formulas  involving  only  combinations  of  standard  integrals. 
In  particular,  we  avoid  the  direct  application  of  the  electron-electron  potential  and 
instead  compute  convolutions  with  the  Poisson  kernel. 

By  doing  this,  we  hope  to  represent  the  effects  of  correlations  in  the  most  natural  and 
concise  way  possible,  thus  providing  both  computational  efficiency  and  physical  insight.  We 
believe  that  this  algorithm  and  the  system  of  integral  equations  underlying  it  provide  the 
foundation  for  a  new  approach  to  solving  the  multiparticle  Schrodinger  equation.  We  defer 
to  the  sequels  several  important  issues,  such  as  algorithmic  size-consistency/extensivity  and 
the  treatment  of  the  inter-electron  cusp. 

In  Section  II  we  formulate  the  problem  more  carefully,  make  precise  some  of  the  state¬ 
ments  that  we  made  in  this  introduction,  and  give  a  high-level  description  of  the  method. 
We  then  present  the  derivations  and  proofs  in  the  following  sections. 


II  Problem  Formulation  and  Description  of  the  Method 

II.  1  Formulation  of  the  Problem 

We  consider  the  time- independent,  nonrelativistic,  multiparticle  Schrodinger  equation,  and 
fix  the  nuclei  according  to  the  Born-Oppenheimer  approximation,  so  the  equation  describes 
the  steady  state  of  an  interacting  system  of  electrons.  For  each  of  the  N  electrons  in  the 
system  there  are  three  spatial  variables  r  =  ( x ,  y,  z )  and  a  discrete  spin  variable  er  taking  the 
values  {— 2 ,  ^},  which  we  combine  and  denote  (r,  o)  by  7.  The  Hamiltonian  operator  Tt  is  a 
sum  of  a  kinetic  energy  operator  T,  a  nuclear  potential  operator  V,  and  an  electron-electron 
interaction  operator  W,  defined  in  atomic  units  by 


n=r+v+w 


N  N  N  N 

~2^Ai  +  5>(ri)+ 

i= 1  i= 1  i=l  jyii 


1 


(1) 


where  is  the  three-dimensional  Laplacian  acting  in  the  variable  ry  and  u(r)  is  a  sum 
of  terms  of  the  form  —  Za/||r  —  Ra||  from  a  nucleus  at  position  Ra  with  charge  Za.  The 
antisymmetric  eigenfunctions  of  H  represent  electronic  states  of  the  system  and  are  called 
wavefunctions.  Antisymmetric  means  that  under  the  exchange  of  any  two  coordinates,  the 
wavefunction  is  odd,  e.g.  ^(72, 71, . . .)  =  —^(71,72,...).  The  bound-state  wavefunctions 
have  negative  eigenvalues,  and  are  of  greatest  interest.  We  will  focus  on  the  ground-state 
wavefunction,  which  has  the  most  negative  eigenvalue,  although  the  techniques  can  be  used 
for  other  states.  In  summary,  our  goal  is  to  find  E  and  ip,  with  E  the  most  negative 
eigenvalue  in 

Hip  =  Eip  ,  (2) 

subject  to  the  antisymmetry  condition  on  ip.  Analytic  methods  can  give  qualitative  results 
about  the  solutions,  and  determine  limiting  cases,  but  most  quantitative  results  must  be 
obtained  numerically.  Although  the  equation  is  ‘just’  an  eigenvalue  problem,  its  numerical 
solution  presents  several  serious  difficulties,  among  them  the  large  number  of  variables  and 
the  antisymmetry  condition  on  the  solution.  The  simplest  method  that  addresses  these  two 
difficulties  is  Hartree-Fock  (HF)  (see  e.g.  [28]),  which  uses  the  antisymmetrization  of  a  single 
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product,  called  a  Slater  determinant,  to  approximate  the  IV-particle  wavefunction,  i.e. 


N 

0HF  = 

i- 1 


1 

m 


Any  antisymmetric  approximation  ip  to 


(pl  (71) 

^1(72) 

•••  <Pi(Jn) 

<p2  (7l) 

02(72) 

•••  02(7iv) 

<Pn(  7l) 

<Pn(  72) 

•••  0iv(7iv) 

the  wavefunction  ip 

can  be  substituted  into 

CH4’,ip) 

<0,  0) 


(3) 


(4) 


where  (•,  •)  is  the  usual  inner  product,  to  obtain  an  estimate  for  E.  This  estimate  gives  an 
upper  bound  on  the  lowest  value  of  E  that  solves  (2).  Substituting  (3)  into  (4),  one  can 
iteratively  solve  for  (pi  to  minimize  (4).  The  resulting  ^hf  will  best  approximate  ip,  in  the 
sense  of  providing  the  best  estimate  (4). 

To  improve  upon  HF,  it  is  natural  to  consider  the  antisymmetrization  of  a  sum  of  prod¬ 
ucts 

r  N 

0(r)  = ,  (5) 

;= i  i—i 

which  could  also  be  written  as  a  sum  of  Slater  determinants.  The  coefficients  si  are  in¬ 
troduced  in  order  to  have  \\<p\\\  =  1.  Many  methods  are  based  on  this  form,  but  they  use 
it  in  different  ways.  The  Configuration  Interaction  (Cl)  method  (see  e.g.  [57])  chooses  the 
functions  <p \  from  a  preselected  master  set  of  orthogonal  functions  and  decides  on  a  large 
number  r  of  combinations  to  consider,  based  on  excitation  level.  Substituting  (5)  into  (4) 
leads  to  a  matrix  eigenvalue  problem  that  can  be  solved  for  the  scalar  coefficients  s;.  The 
Multi-Configuration  Self-Consistent  Field  (MCSCF)  method  (e.g.  [20,  11])  solves  for  the 
master  set  of  orthogonal  functions  as  well  as  the  scalar  coefficients.  There  are  numerous 
variations  and  combinations  of  these  methods,  too  many  to  describe  here. 


II. 1.1  What  is  New  Here 

In  this  work  we  construct  and  demonstrate  a  method  that  also  uses  a  wavefunction  of  the 
form  (5)  but  without  constraints  on  the  </>(.  We  remove  both  structural  constraints,  such  as 
an  excitation  pattern  or  orthogonality  between  single-electron  functions,  and  representation 
constraints,  such  as  those  imposed  by  using  a  predetermined  basis  set. 

Many  methods  (e.g.  [55,  47,  39,  1,  19,  15,  18,  2,  16,  60,  41])  have  loosened  the  constraints 
on  the  Slater  determinants  in  one  way  or  another,  often  with  encouraging  results.  These 
works,  however,  only  partially  removed  the  constraints,  and  so,  we  claim,  did  not  achieve 
the  full  potential  of  an  unconstrained  approximation.  By  removing  these  constraints  we 
hope  to  produce  much  better  approximations  at  much  smaller  separation  rank  r  than  ex¬ 
isting  methods  allow.  We  also  hope  to  provide  new  perspective  from  which  to  analyze  and 
understand  the  wavefunction,  free  from  the  biases  that  physical  intuition  imposes. 

Our  hopes  are  based  on  our  work  in  [4,  5,  43],  where  we  developed  general  methods  to 
represent  and  compute  with  functions  and  operators  in  many  dimensions.  We  used  sums 
of  separable  functions,  dubbed  separated  representations,  similar  to  (5).  We  found  rather 
natural  examples  where  removing  constraints  produces  expansions  that  are  exponentially 
more  efficient,  i.e.  r  =  N  instead  of  2N  or  r  =  log  IV  instead  of  N.  For  example,  in  our 
approach  we  can  have  a  two-term  representation 

N  N 

Ip  =  -4]^[  Mli)  +  +  <Pi+N(li)) 

i= 1  i= 1 


(6) 
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where  }^= 1  form  an  orthogonal  set.  To  represent  the  same  function  as  (6)  while  imposing 
the  constraint  that  factors  come  from  a  master  orthogonal  set  would  force  one  to  multiply 
out  the  second  term,  and  thus  use  a  representation  with  2N  terms. 

At  present  we  have  no  proof  that  the  wavefunction  is  well-approximated  by  a  structure 
that  would  benefit  from  the  removal  of  constraints.  The  size  r  needed  in  practice,  and 
how  it  depends  on  the  various  parameters  in  the  problem,  is  thus  still  an  open  question. 
In  [4,  5,  43],  the  most  interesting  examples  came  from  “reverse-engineering”  the  numerical 
results  to  obtain  formulas  and  proofs.  We  therefore  expect  that  the  tools  we  provide  here 
will  allow  an  exploration  of  the  wavefunction,  perhaps  revealing  unexpected  structure,  and 
a  strategy  for  a  proof. 

II. 2  Description  of  the  Algorithm 

The  removal  of  constraints  in  (5),  and,  thus,  the  basis  sets,  coefficients,  and  other  structure 
that  went  along  with  them,  also  eliminates  the  conventional  strategies  for  constructing  (5)  to 
minimize  (4).  It  leads  one  to  consider  how  one  would  compute  the  ground-state  wavefunction 
if  its  numerical  representation  were  not  an  issue.  We  choose  to  use  an  integral  iteration, 
which  we  sketch  in  Section  II. 2.1.  In  Appendix  A  we  sketch  an  alternative  iteration  based 
on  gradient  descent. 

To  use  the  form  (5)  we  must  choose  some  value  of  r,  which  determines  the  quality  of 
the  approximation.  In  Section  II. 2. 2  we  show  how  to  incorporate  a  nonlinear  fitting  step 
within  the  integral  iteration  in  order  to  maintain  fixed  r.  Accomplishing  this  fitting  requires 
a  significant  amount  of  machinery,  which  makes  up  the  body  of  the  paper.  Eventually  one 
would  want  to  adaptively  determine  r,  but  we  do  not  address  that  issue  here. 

II. 2.1  A  Green’s  Function  Iteration 

The  eigenvalue  equation  (2)  contains  the  differential  operator  H,  which  has  both  the  dis¬ 
crete  negative  eigenvalue(s)  that  we  are  interested  in  and  unbounded,  continuous,  positive 
spectrum.  In  [31,  32]  this  differential  equation  was  reformulated  as  an  integral  equation, 
producing  an  operator  with  only  discrete,  bounded  spectrum.  Such  integral  formulations 
are  in  general  far  superior  to  differential  formulations,  since,  e.g.  numerical  noise  is  sup¬ 
pressed  rather  than  amplified.  An  iteration  based  on  the  integral  formulation  with  Green’s 
functions  was  introduced  in  [31,  32]  and  used  in  e.g.  [12,  26].  A  rigorous  analysis  of  this 
iteration  is  given  in  [44]  based  on  classical  theorems  from  [30,  33,  52,  53,  54].  In  this  section 
we  review  this  iteration,  and  then  modify  it  in  Section  II. 2. 2  to  preserve  our  wavefunction 
representation  (5). 

Define  the  Green’s  function 

^(T^I)-1,  (7) 

for  /.i  <  0,  and  consider  the  Lippmann-Schwinger  integral  equation 

A^  =  -S/*[(V  +  W)Vv,].  (8) 

The  subscript  /r  on  Ap  and  i/v  are  to  emphasize  the  dependence  of  the  eigenvalues  and 
eigenfunctions  on  /i.  The  operator  ^[(V+W)]  is  compact,  so  (8)  has  only  discrete  spectrum. 
If  fi  =  E,  then  there  is  an  eigenvalue  AM  =  1  and  the  corresponding  eigenfunction  '<A  of  (8) 
is  the  desired  ground-state  eigenfunction  of  (2),  as  one  can  see  by  rearranging  (8)  into  (2). 
We  note  that  other  eigenfunctions  may  be  obtained  by  deflation. 

When  /i  =  E.  A/x  =  1  is  the  largest  eigenvalue,  so  a  simple  iteration  like  the  power  method 
yields  the  desired  ground-state  eigenfunction.  The  eigenvalues  A^  depend  analytically  on  /i, 
so  when  [i  is  sufficiently  close  to  E  the  power  method  will  still  yield  an  eigenfunction  of  (8) 
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with  energy  near  the  minimum  of  (4).  From  and  A ^  one  can  construct  an  update  rule 
for  /i,  based  for  example  on  applying  Newton’s  method  to  solve  AM  =  1. 

The  convergence  rate  of  the  power  method  to  produce  and  AM  is  linear,  and  depends, 
as  usual,  on  the  gap  between  the  eigenvalues  in  (8).  The  convergence  rate  of  Newton’s 
method  to  solve  A^  =  1  is  quadratic,  so  /r  will  converge  to  E  quadratically,  provided  that 
Am  and  t/V-  have  been  found  at  each  step.  In  the  practical  use  of  this  approach,  one  does 
not  wait  for  the  power  method  to  converge  at  each  step,  but  instead  intertwines  it  with  the 
update  of  fi.  Beginning  with  an  approximation  to  the  energy  /To  ~  E  and  an  approximate 
wavefunction  if>o,  one  converts  (8)  to  an  iteration 

i’n  =  -Gnn[(V  +  W)i>n\.  (9) 

After  each  iteration  one  normalizes  by  setting 

tpn+l  =  i’n/Wi’nW-  (10) 

Following  the  approach  of  [26],  we  can  use  the  update  rule 

/i„+l  =  Hn  -  ((V  +  ~  '0n)/||'0n||2  ,  (H) 

which  is  equivalent  to  using  Newton’s  method. 


II. 2. 2  Approximating  with  Fixed  Separation  Rank  r 

We  restrict  the  method  to  approximate  wavefunctions  of  the  form  (5),  with  r  fixed,  by 
replacing  the  definition  of  ipn  in  (9).  We  define  ipn  to  be  the  function  of  the  form  (5)  that 
minimizes  the  (least-squares)  error 

||V’n-(-£M„[(V  +  W)'0n])||.  (12) 


Since  using  (12)  instead  of  (9)  introduces  an  error,  the  update  rule  (11)  may  no  longer  give 
quadratic  convergence,  and  in  any  case  is  not  expected  to  converge  to  the  true  energy.  One 
may  choose  to  replace  the  update  rule  (11)  with  the  more  robust  but  slower  converging  rule 


Mn+1  “  |  t-V.  ill2 


(13) 


which  is  based  on  (4).  Other  rules  may  be  possible  as  well.  At  present  we  do  not  have 
enough  numerical  experience  to  decide  which  rule  to  prefer. 

The  Green’s  function  iteration  itself  does  not  enforce  the  antisymmetry  condition.  In 
order  to  assure  convergence  to  an  antisymmetric  solution,  we  use  the  pseudo-norm  induced 
by  the  pseudo  inner  product  (•,  -)^  =  (A(-),  A(-)),  as  we  did  in  [5]. 

The  least-squares  problem  (12)  is  non-linear,  and  so  very  difficult  in  general.  To  simplify 
notation  in  the  description  of  our  method,  we  now  suppress  the  index  n  in  (12)  and  consider 
a  single  problem  of  that  form.  We  begin  by  setting  if)  =  %/>,  and  then  iteratively  improve  to 
reduce  (12).  Although  we  can  see  several  strategies  for  improving  //>,  for  concreteness  we  will 
restrict  our  description  to  the  strategy  most  similar  to  [5] .  To  improve  the  approximation  if) 
we  loop  through  the  variables  (electrons).  The  functions  in  variables  other  than  the  current 
variable  are  fixed,  and  the  functions  in  the  current  variable  are  modified  to  minimize  the 
overall  error  (12).  The  error  (12)  depends  linearly  on  the  functions  in  a  single  variable, 
so  the  minimization  becomes  much  easier.  This  general  Alternating  Least-Squares  (ALS) 
approach  is  well-known  (see  e.g.  [27,  36,  38,  10,  14,  58]).  Although  to  minimize  (12)  one 
may  need  to  loop  through  the  variables  multiple  times,  it  appears  to  be  more  cost  effective 
to  loop  only  once  and  then  do  the  next  Green’s  function  iteration.  We  alternate  through  the 
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directions,  but  for  ease  of  exposition  we  describe  the  k  =  1  case.  So,  <f)lk  is  fixed  for  k  >  1, 
and  we  will  solve  for  the  values  of  (j>\  for  all  l. 

To  refine  in  the  current  variable,  we  set  up  and  solve  a  linear  least-squares  problem.  The 
normal  equations  for  a  least-squares  problem  are  derived  by  taking  a  gradient  with  respect 
to  the  free  parameters  and  setting  the  result  equal  to  zero.  As  long  as  the  approximating 
function  is  linear  and  not  degenerate  in  these  parameters,  the  resulting  equations  are  linear 
and  have  a  unique  solution,  which  minimizes  the  error  with  respect  to  these  parameters. 
Usually  these  free  parameters  are  coefficients  of  the  representation  in  some  fixed  basis.  For 
example,  to  find  the  coefficients  {cj}  to  minimize 


f~^2ci9i  =  (/  -  ^Ciguf  -  ^2 1 


construct  the  normal  equations 


Ax  =  b . 


A(k,i)  =  (gk,9i)  and  b(k)  =  ( gk,f ) ,  (16) 

solve  them,  and  set  c*  =  x (i).  Instead  of  using  coefficients  in  some  basis  as  our  parameters, 
we  take  the  parameters  to  be  the  point  values  of  our  functions  d>\ ,  so  that  the  gradient 
becomes  a  variational  derivative.  Formally,  we  consider  a  basis  of  delta  functions  S(j  —  •) 
and  let  their  coefficients  be  our  parameters.  We  still  obtain  linear  normal  equations  (15),  but 
now  b  and  x  are  vectors  of  functions,  and  A  is  a  matrix  of  integral  operators.  Specifically, 
b(l )  is  a  function  of  7,  x(l')  is  a  function  of  7',  and  A(l,  l')  is  an  integral  operator  mapping 
functions  of  Y  to  functions  of  7.  The  kernels  in  A  are  formally  defined  by 


a(i,i')(  7,70 =stsi'  (<5(7-71)  n$(7»),  <j(7/  -71)  n$  (7;) 


and  the  functions  in  b  are  defined  by 


b(l){Y  =  Si  ^2  sm  (  5(7  -  7l)  <Yi (7* ) ,  -0/^  +  W]  II  $"(7 i) 


Once  we  solve  (15),  we  set  <j>\  =  x(l).  To  enforce  the  normalization  convention  ||^||  =  1  we 
can  divide  <j>\  by  its  norm  and  incorporate  the  norm  into  s;. 

To  solve  the  matrix-integral  system  (15),  we  need  an  iterative  method  for  solving  linear 
systems  that  uses  only  operations  compatible  with  integral  operators,  such  as  matrix- vector 
products,  vector  scales  and  additions,  and  vector  inner  products.  Typically  the  matrix  A 
in  normal  equations  is  positive-definite.  Our  operator  A  is  only  semidefinite  due  to  the 
nullspace  in  the  antisymmetric  pseudonorm.  Fortunately,  b  was  computed  with  the  same 
pseudonorm  and  has  no  component  in  the  nullspace  of  A,  so  we  can  still  use  methods  for 
positive-definite  matrices.  Based  on  these  considerations,  we  choose  to  use  the  Conjugate 
Gradient  iterative  method  (see  e.g.  [21])  to  solve  (15).  One  initializes  with  r  =  b  —  Ax, 
v  =  r,  and  c  =  (r,r),  and  then  the  core  of  the  method  is  the  sequence  of  assignments 
z  <—  Av,  £  <—  c/(v,  z),  x  <—  x  +  tv,  r  <—  r  —  fz,  d  <—  (r,  r),  v  4—  r  +  (d/c)v,  and  c  <—  d, 
applied  iteratively. 

One  advantage  of  using  this  iterative  method  with  integral  operators  is  that  our  algorithm 
does  not  rely  on  any  particular  basis.  The  representation  of  x  can  naturally  be  adaptive  in 
7,  for  example  refining  near  the  nuclei  as  indicated  by  the  refinement  in  b.  We  assume  the 
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availability  of  some  adaptive,  high-accuracy  representation  for  single-electron  functions,  such 
as  the  polynomial  multiwavelet  representation  demonstrated  in  [25,  26],  which  effectively 
eliminates  the  basis-set  error.  For  the  estimates  of  computational  cost,  we  use  M  to  denote 
the  cost  to  represent  a  function  of  7,  or  integrate  such  a  function.  The  antisymmetry 
constraint  requires  N  <  M ,  and  in  general  we  expect  M  to  be  much  larger  than  N. 

11. 2. 3  Summary  of  the  Remainder  of  the  Paper 

The  core  of  the  paper  is  the  development  of  the  methods  needed  to  construct  A  in  (17)  and 
b  in  (18).  First,  in  Section  III,  we  develop  the  machinery  and  algorithms  for  computing 
antisymmetric  inner  products  involving  the  operators  T,  V,  and  W.  Our  formulation  uses 
low-rank  perturbations  of  matrices,  thus  avoiding  cofactor  expansions.  We  also  avoid  explicit 
construction  of  W  by  incorporating  its  effect  via  spatial  convolutions  with  the  Poisson  kernel 
in  three  dimensions.  Second,  in  Section  IV,  we  show  how  to  compute  antisymmetric  inner 
products  involving  these  operators  and  the  delta  function  5(7  —  71).  Again  the  key  is  to  use 
low-rank  perturbations  of  matrices. 

In  Section  V  we  assemble  all  our  tools  to  demonstrate  how  to  perform  our  main  algorithm, 
and  in  particular  how  to  construct  A  in  (17)  and  b  in  (18).  We  also  gather  the  computational 
cost  for  the  whole  method.  The  cost  depends  on  the  number  of  electrons  N ,  the  separation 
rank  r,  the  one-particle  representation  cost  M,  the  number  of  Green’s  function  iterations  / 
(see  Section  II. 2.1),  and  the  number  of  conjugate  gradient  iterations  S  (see  Section  II. 2. 2). 
Although  S  in  theory  could  be  as  many  as  the  number  of  degrees  of  freedom  rM,  we  have  a 
very  good  starting  point,  and  so  expect  only  a  very  small  constant  number  to  be  needed.  We 
use  M  log  M  to  denote  the  generic  cost  to  convolve  a  function  of  7  with  the  Poisson  kernel 
1/ 1| r || .  A  Fourier-based  Poisson  solver  on  a  uniform  grid  would  achieve  this  complexity;  for 
adaptive  methods  such  as  we  use  it  is  very  difficult  to  state  the  cost  (see  [7,  17]).  We  use  L 
to  denote  the  number  of  terms  used  to  approximate  the  Green’s  function  to  relative  error  e 
with  Gaussians,  and  prove  in  Section  V.l  that  L  =  0((lne)2)  independent  of  /i  and  N.  The 
final  computational  cost  is  then 

0(Ir2N2  [L(N  +  M  log  M)  +  S{N  +  M)}).  (19) 

For  comparison,  the  cost  to  evaluate  a  single  antisymmetric  inner  product  via  Lowdin’s  rules 
is  0(N2{N  +  M)). 

11. 3  Further  Considerations 

We  have  implemented  the  method  developed  here  and  tested  it  sufficiently  to  verify  the 
correctness  of  the  algorithm  as  presented.  The  numerical  results  are  too  preliminary  to  allow 
us  to  make  any  particular  claims  at  this  point,  however,  so  we  will  present  them  separately. 
The  linear  algebra  accelerations  based  on  Appendix  B  have  not  yet  been  implemented. 

We  develop  the  method  in  terms  of  the  total  variable  7  without  specifying  the  spin 
states.  If  a  specific  spin  state  is  imposed  on  our  initial  trial  wavefunction  ipo,  the  iteration 
will  preserve  this  state. 

The  representation  (5)  does  not  account  for  the  inter-electron  cusp  (see  e.g.  [56,  46, 
35,  49,  50,  34,  37]),  and  thus  we  cannot  hope  to  achieve  small  error  e  in  the  wavefunction 
with  small  r.  As  with  Configuration  Interaction  methods,  we  may  still  be  able  to  achieve 
small  error  in  the  energy  difference  of  two  systems,  which  is  often  the  quantity  of  interest 
in  Chemistry.  For  the  current  work,  we  fix  r  and  adapt  and  Si  to  minimize  the  error 

e,  rather  than  fixing  e  and  adaptively  determining  r.  We  are  developing  an  extension  to  (5) 
that  incorporates  the  cusp,  and  hope  to  achieve  small  error  e  through  it. 
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Similarly,  (5)  is  not  size-consistent/extensive,  and  thus  is  not  suitable  for  large  systems. 
We  are  also  developing  an  extension  to  (5)  suitable  for  large  systems,  and  hope  to  achieve 
linear  scaling  through  it. 

Although  we  have  focused  on  the  multiparticle  Schrodinger  equation,  the  tools  that  we 
have  developed  are  another  step  towards  general-purpose,  automatically  adaptive  methods 
for  solving  high-dimensional  problems. 


Ill  Antisymmetric  Inner  Products 

In  this  section  we  develop  methods  for  computing  antisymmetric  inner  products  involving 
W,  V,  and  T.  For  this  purpose,  after  setting  notation,  we  develop  methods  for  computing 
with  low  rank  perturbations  of  matrices,  review  the  antisymmetry  constraint  and  define  a 
notion  of  maximum  coincidence.  With  these  tools  we  then  derive  the  main  formulas. 


III.  1  Notation 

We  denote  a  column  vector  with  suppressed  indices  by  F  and  with  explicit  indices  by  F(i). 
We  denote  its  conjugate  transpose  by  F*.  We  use  e;  to  denote  the  column  vector  that 
is  one  in  coordinate  i  and  zero  otherwise.  A  linear  operator  is  written  C.  We  denote  a 
matrix  with  suppressed  indices  by  L  and  with  explicit  indices  by  Recalling  that 

r  =  ( x ,  y,  z)  G  R3,  we  combine  spatial  integration  with  summation  over  spins  and  define  the 
integral 

f  f('y)d'Y=  [  /(r’CT)dr-  (20) 

J  a&{-l/2,l/2}J 

We  define  the  action  of  the  single-electron  kinetic  and  nuclear  potential  operators  by 

{%  +  V*)  [/]( 7)  =  /( 7)  =  +  v(r)j  /(r>  a)-  (21) 


In  what  follows  we  will  reduce  the  action  of  the  inter-electron  potential  operator  W  to 
convolutions  with  the  Poisson  kernel,  so  we  define 


W.  [/]« 


/( i)H 


E 

<r'e{  — 1/2.1/2} 


(22) 


We  allow  these  operators  to  be  applied  componentwise  to  vectors  and  matrices  of  functions. 
Next,  we  define  <I>  =  TIE'MtOj  so  f°r  example  we  can  write  instead  of 

■  We  also  associate  with  the  product  <I>  a  vector  of  N  func¬ 
tions  of  a  single  variable, 


$  = 


(23) 


4>n 


We  can  then,  for  example,  construct  a  new  vector  of  functions  0  by  applying  a  matrix  to 
an  old  one,  as  in  0  =  L”1^.  Although  we  do  linear  algebra  operations  on  these  vectors, 
we  note  that  $  +  $  does  not  correspond  to  $  +  d>,  so  there  is  not  a  true  vector-space 
structure.  Our  formulas  contain  fairly  complicated  expressions  with  such  vectors,  such  as 
J  $>*WV  [0$*]0d7.  To  parse  this  expression,  we  note  that  0  is  a  column  vector  of  functions 
and  is  a  row  vector  of  functions,  so  ©<I>*  is  a  matrix  of  functions.  Then  Wv  [©<!?*]  is 
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still  a  matrix  of  functions,  but  applying  <&*  on  the  left  and  ©  on  the  right  yields  a  single 
function,  which  is  integrated  in  the  implied  variable  7  to  yield  a  number.  When  explicit 
specification  of  the  variable  involved  is  needed,  the  notation  $(7)  indicates  that  the  single 
variable  7  is  used  in  all  the  functions. 

III. 2  Determinants  of  Low-Rank  Perturbations  of  Matrices 

Since  the  antisymmetric  inner  product  involves  determinants,  we  will  use  some  linear  algebra 
relations  for  them.  Proposition  2  in  this  section  is  used  heavily,  and  is  the  key  to  avoiding 
rather  unpleasant  cofactor  expansions. 

Proposition  1  (Determinant  via  Schur  Complement)  Let  A  be  a  nonsingular  square 
matrix ,  D  a  square  matrix,  and  B  and  C  matrices  of  appropriate  size.  Then 

=  |A|  ID-CA-1!!  .  (24) 


A  1 
C  B 


Proof:  (see  e.g.  [51])  It  is  easy  to  verify  directly  that 


'  A  B  ' 

O 

t=l 

'A  0 

1=1 

> 

a 

C  B 

CA"1  I 

0  ID  -  CA”1! 

1=1 

O 

_ 1 

(25) 


Since  the  determinants  of  the  first  and  third  matrices  are  equal  to  one,  the  determinant  of 
the  middle  matrix  gives  the  desired  result.  □ 


Proposition  2  (Determinant  of  a  Perturbation  of  the  Identity)  Le<{ug}^=1  and{vq} 
be  two  sets  of  vectors  of  the  same  length,  and  ugv*  denote  the  outer  product  ofuq,  and  vq. 
Then 


Q 

I  +  U«V9 


1+V^Ui  V  (  u2 
V2U1  1  +  V2U2 


viuQ 

V>Q 


(26) 


q= 1 


VQU1  VQU2  •••  1+VqUq 


Proof:  Let  U  be  the  matrix  with  the  vectors  {u9}  as  its  columns,  and  V  the  matrix  with 
the  vectors  {vg}  as  its  columns.  Note  that  U  and  V  are  of  the  same  size.  By  Proposition  1 
we  have 


I  U 
-V*  I 


|I  +  V*U|, 


(27) 


which  evaluates  to  the  right  side  of  (26). 
we  have 

I 

-V* 


Exchanging  the  roles  of  A  and  ID  in  Proposition  1 
^  =  |I  +  UV*| ,  (28) 


which  evaluates  to  the  left  side  of  (26).  □ 

The  Q  =  1  case  is  well-known  (see  e.g.  [51])  but  we  have  not  found  the  general  case  in  the 
literature. 
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III. 3  The  Modified  Pseudo-inverse 

The  singular  value  decomposition  (SVD)  (e.g.  [21])  of  a  N  x  N  matrix  is 

N 

A  =  J2  SjUjV*  =  USV*  ,  (29) 

t= i 

where  the  matrices  U  and  V  are  unitary  and  the  singular  values  {s^}  are  non-neganive  and 
in  descending  order.  The  left  singular  vectors  {u,}  form  an  orthonormal  set,  as  do  the  right 
singular  vectors  {v;}.  The  pseudo-inverse  is  defined  as 

N-Q 

Af  =  ^  s^ViU* ,  (30) 

i= l 


where  Q  is  the  dimension  of  the  (numerical)  nullspace.  We  also  define  a  projection  matrix 
onto  the  nullspace 


Definition  3 


and  a  modified  pseudo-inverse 


N 

A±  =  y  v*u* 

i=N  —  Q+ 1 


Definition  4  (Modified  Pseudo-Inverse) 


A*  =  Af  +  A1- . 


(31) 


(32) 


Note  that  A-1  and  thus  A*  are  not  uniquely  dehned  since  the  choice  of  basis  for  the  nullspace 
is  not  unique.  For  our  purposes  any  consistent  choice  works.  The  modified  pseudo-inverse 
behaves  much  like  the  pseudo-inverse,  but  always  has  a  non-zero  determinant, 

iunvin«  ^°-  (33) 

Si#0  / 


III. 4  The  Antisymmetrizer  and  Lowdin’s  Rule 

Given  a  separable  function,  its  antisymmetric  projection  can  be  found  by  applying  the 
antisymmetrizer  A  (see  e.g.  [48]),  also  called  the  skew-symmetrization  or  alternation  (see 
e.g.  [45,  51]),  resulting  in  a  Slater  determinant.  In  the  vector  notation  (23),  we  have 


01  (7l) 

0i(72)  • 

••  0i(7iv) 

1 

1 

02(7l) 

02(72)  • 

••  02(7 n) 

m  II  *(•») 

-  *(w)ll  =  jvr 

0iv(  7i) 

0Jv(  72)  ' 

■  ■  0jv(7iv) 

One  cannot  explicitly  form  a  Slater  determinant  for  large  N  since  it  would  have  N\ 
terms.  However,  one  can  compute  the  antisymmetric  pseudo  inner  product 
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where  the  first  equality  is  a  definition  and  the  others  follow  since  A  is  an  orthogonal  pro¬ 
jector.  It  is  not  a  true  inner  product  because  it  has  a  nullspace.  To  compute  (35),  first 
construct  the  matrix  L  with  entries 


L(i,j)  =  (<li,<f>j)  (36) 

at  cost  0(N2M).  Then  use  (4>,  =  (-4$,  $)  and  move  the  integrals  inside  the  determi¬ 

nant  to  obtain 

($,$U  =  ^|L|,  (37) 

which  is  the  so-called  Lowdin’s  rule  (e.g.  [40,  48]).  Since  L  is  an  ordinary  matrix,  its 
determinant  can  be  computed  with  cost  0(N3)  (or  less).  The  denominator  Nl  need  never 
be  computed,  since  it  will  occur  in  every  term  in  our  equations,  and  so  cancels. 

Our  method  for  enforcing  the  antisymmetry  constraint,  as  described  in  [5],  is  to  use  the 
pseudo-norm  based  on  the  antisymmetric  inner  product  (•,  -)_4  for  the  least-squares  fitting 
(12). 

III. 5  Maximum  Coincidence 

Consider  two  products,  $  =  n£Li  <M7*)  aRd  ^  =  TT^i  <M7«)>  stored  in  the  vector  notation 
of  (23)  as  #  and  3>.  To  specify  which  functions  were  used  to  compute  L  in  (36),  we  use  the 
notation  L($,  <b).  The  matrix  of  inner  products  L  =  L(4>,  $)  is  in  general  full.  Defining 

0  =  L-14> ,  (38) 


we  have 

710  =  ^|[  (L-14)(7l)  •••  (L"1*)(7jv)  ]| 

=  |L-1|^|[  *(7l)  ...  4>(7iV)  ]|  =  |L-V$.  (39) 

Thus  the  antisymmetrizations  of  $  and  0  are  the  same  up  to  a  constant,  and  we  can  use  0 
instead  of  $  in  calculations.  The  advantage  of  using  0  is  that  the  resulting  matrix  of  inner 
products  L  =  L(0,  $)  =  I;  in  other  words,  we  have  the  biorthogonality  property  ( 9i,<j>j )  = 
Sij.  To  show  this,  write  the  matrix  L  as  f  where  the  integration  is  elementwise. 

Substituting  for  0,  we  have  J Since  the  integration  is  elementwise  it  commutes 
with  L”1  and  we  have  L”1  f  •fr&*dj  =  L-1L  =  I.  The  computational  cost  to  construct  0 
is  0(N2(N  +  M)). 

When  the  matrix  L  in  (36)  is  singular,  we  define  0  =  14  4>  using  the  modified  pseudo¬ 
inverse  of  Definition  4.  By  the  same  argument  as  before,  we  have  |Ll|-17l0  =  A&.  The 
matrix  J  ®$*dj  evaluates  to  L^L  =  I  —  ^i=Ar-Q+i  viv* -  For  notational  convenience  in 
later  sections,  we  will  re-index  our  singular  values  and  vectors  so  that  the  first  Q  generate 
the  nullspace,  rather  than  the  last  Q. 

Remark  5  Within  Configuration  Interaction  methods,  the  functions  in  $  and  $  are  taken 
from  a  master  set  of  orthonormal  functions,  and  0  is  simply  a  signed  permutation  of  $ 
so  that  4>j  =  0j  for  as  many  j  as  possible.  This  is  known  as  the  ‘maximum  coincidence  ’ 
ordering.  The  construction  we  use  generalizes  this  notion. 
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III. 6  Antisymmetric  Inner  Product  with  the  Electron-Electron  Po¬ 
tential  W  Present 

In  this  section  we  derive  formulas  for  computing  antisymmetric  inner  products  that  include 
the  electron-electron  interaction  potential.  Although  the  derivation  is  somewhat  messy,  the 
resulting  formulas  are  rather  clean,  and  we  use  them  verbatim  in  the  computations.  The 
main  ideas  are  given  in  this  section,  and  then  reused  in  later  sections  for  other  cases. 

Proposition  6  When  L  from  (36)  is  nonsingular, 


is  equal  to 

j  <r0wP  [«&*©]  -  [©<r]©d7 ,  (4i) 

where  0  =  L_14>. 


Proof:  Using  the  maximum-coincidence  procedure  in  Section  III. 5,  (40)  is  equal  to  |L|  (0,  W^) 
We  reorganize  and  find  that  we  must  compute 


#i(7i)  #1(72) 

#2(71)  #2(72) 


0n(  71)  9n{  72) 


9i(1n) 
$2(7  jv) 


dji  ■  ■  ■  djN  .  (42) 


9n(jn) 


By  moving  the  sum  outside  of  the  integral,  we  can  integrate  in  all  directions  except  7,  and 
7 v  Using  (9m,<j>n)  =  dmn,  we  obtain 


1JH 

2  AT! 


E 

*¥=j  ' 


r  —  r' 


=  IN  w 

2  N\  ^ 


1  • 

••  0i(7)01(7) 

•••  •• 

0  • 

••  ~4>i{l)9i(ct) 

•••  •• 

0  • 

••  ^*(7)^(7) 

•••  •• 

0  • 

••  <M7)0jv(7) 

•••  $j{l')9N{l')  •• 

1 

-  T-'ll 

1+  (</>;( 7)®(7)  - 

e0  e*  +  (4>jW)  0(7') 

djdy 


ej)ei 


d'yd'y1 .  (43) 


Since  the  inner  matrix  is  a  low-rank  perturbation  of  the  identity,  we  reduce  its  determinant 
using  Proposition  2  and  obtain 


1  M 

2  N\ 


1 


II  r  —  r'|| 


^(7)^(7') 


9i(l) 
9 3(1) 


0i(Y) 

9jW) 


d'yd'y' . 


(44) 


The  determinant  is  zero  if  j  =  i,  so  we  do  not  need  to  explicitly  prohibit  it  as  we  needed 
to  in  (43)  and  above.  The  antisymmetrization  has  caused  a  convenient  cancellation  of  a 
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fictitious  self-interaction,  and,  thus,  allowed  us  to  decouple  the  two  sums.  Expanding  out 
the  determinant  and  rearranging  the  terms,  we  obtain 


d 7 


IE 

2  IV! 


In  our  compact  notation,  this  yields  (41). 


r  —  r 


7TT  fyWWi 


dy .  (45) 

□ 


We  now  consider  the  computational  cost  of  (41).  In  the  first  term  in  (41),  computing 
3>*0  costs  O(NM),  applying  1VP  [•]  to  it  costs  0{M  log  M),  and  the  integral  in  7  costs 
O(M).  In  the  second  term,  $0*  costs  0(N2M),  applying  Wv  [•]  to  it  costs  0(N2M  log  M), 
applying  0*  and  then  $  costs  0(N2A1 ),  and  then  the  integral  in  7  costs  O(M).  Including 
the  cost  to  construct  0,  our  total  cost  is  0(N2(N  +  MlogM)). 


III. 6.1  The  Singular  Case 

In  this  section  we  investigate  the  case  when  the  matrix  L  from  (36)  is  singular.  Inserting 
the  definition  0  =  into  our  main  formula  (41),  we  have 


IJH 

2  IV! 


-$*>w 


L  -^dj. 


(46) 


In  terms  of  the  SVD  (29),  we  can  express 


N 

L'1  =  sjlyriu*j  and  N  =  |U||V*|  n  Si  •  (47) 

3— 1  i 

Inserting  these  expressions  into  (46),  we  have 


in,:  Si 


N 


N\ 


**E 


;wu;<i>nv 


i=i 

-$*>W 


N 


^EsEui.$ 


k—1 


N 


E^'v'u; 


U=1 

N  N 


N 


YjSklv*Uk®d  7 


k= 1 


n  »*  / 


j=l  fc= 1  i^j,k 


$*VkU*k$ 


u**$*vfc 


U  i$dj.  (48) 


If  L  is  singular  then  at  least  one  s*  is  zero,  and  only  terms  that  exclude  those  from  the 
product  in  (48)  are  nonzero.  Since  we  exclude  two  indices  in  the  product,  if  more  than  two 
Si  are  zero  then  the  entire  inner  product  is  zero.  If  exactly  two  are  zero  then  only  one  term 
in  the  sum  survives.  If  exactly  one  is  zero  then  we  can  simplify  from  a  double  to  a  single 
sum,  using  symmetry.  Recalling  the  modified  pseudo  inverse  from  Definition  4  and  sorting 
the  zero  s,;  to  the  beginning  for  notational  convenience,  we  obtain  the  following  propositions. 

Proposition  7  When  the  rank- deficiency  of  L  is  more  than  two,  the  antisymmetric  inner 
product  (fO)  evaluates  to  zero. 
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Proposition  8  When  the  rank- deficiency  of  L  is  equal  to  two ,  the  antisymmetric  inner 
product  (fO)  is  equal  to 


1 

|Lt|AT! 


3>*v2u^4> 


$*vi  wv 


$*v2u^  u Z&dj. 


(49) 


Proposition  9  When  the  rank- deficiency  of  L  is  equal  to  one,  defining  ©  =  or  0  = 
Lt$,  the  antisymmetric  inner  product  (fO)  is  equal  to 


1 

| u\m 


$’viu l$Wv  [$*©] 


§*viW, 


0c?7 . 


(50) 


In  computing  (49),  constructing  3?*Vi,  3»*v2,  u^,  and  u23>  costs  O(NM),  applying 
WT  [•]  costs  0{M\ogM)  and,  finally,  the  integral  in  7  costs  0(M).  In  computing  (50),  the 
first  term  costs  0{NM)  to  form  3>*0,  0{M  log  M)  to  apply  Wv  [•],  and  O(M)  to  integrate 
in  7.  The  second  term  costs  O(NM)  to  form  0(NM  log  M)  to  apply  Wv  [•],  O(NM) 

to  apply  0,  and  0{M)  to  integrate  in  7.  In  total,  the  computational  cost  for  the  singular 
cases  are  less  than  the  cost  of  the  nonsingular  case. 


Remark  10  In  the  Configuration  Interaction  context,  rank- deficiency  two  corresponds  to 
a  double  excitation.  The  vectors  U;  and  Vj  would  be  zero  except  for  a  single  entry,  and  so 
select  the  locations  of  the  excited  electrons  out  of  and  3>.  Proposition  8  then  reduces  to 
the  Slater- Condon  rules  [13]. 


III. 7  Antisymmetric  Inner  Product  with  T  and/or  V  Present 

Since  T  and  V  both  have  the  structure  of  a  sum  of  one-directional  operators,  we  state  the 
formulas  for  their  sum,  although  of  course  they  can  be  treated  individually. 

Proposition  11  If  L  from  (36)  is  nonsingular, 

($,  (T  +  V)$)_A  C=  II  ‘M'Xi),  ~\Al  +  v(ri)  j  II  (51) 

is  equal  to 

+  (52) 

Proof:  We  follow  the  same  procedure  as  we  used  for  the  electron-electron  operator  W  in 
Section  III. 6.  Instead  of  (43)  we  have  the  simpler  expression 


w 

IV! 


I  +  ((T*  +  V*)  [^]  (7)0(7)  -  e4)  e* 


d'y . 


(53) 


Applying  Proposition  2  we  obtain  (52). 


□ 


To  analyze  the  computational  cost  to  compute  (52),  we  note  that  it  costs  0(NM)  to 
apply  (%  +V*)[-].  Including  the  cost  for  the  maximum  coincidence  transformation,  our 
total  cost  is  thus  0(N2(N  +  M )). 
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III.  7.1  The  Singular  Case 

We  now  state  the  formula  when  L  is  singular.  The  analysis  is  similar  to  that  for  W  in 
Section  III. 6.1. 

Proposition  12  If  the  rank- deficiency  of  L  is  greater  than  one,  (51)  evaluates  to  zero.  If 
it  is  equal  to  one  we  have 

-J—J(%  +  V*)[**v1}u*1*d'y.  (54) 

To  compute  (54),  it  costs  0(NM)  to  form  3?*Vi  and  u)4>,  and  O(M)  to  apply  +  V*)  [•]. 

IV  Incorporating  Delta  Functions  into  the  Antisym¬ 
metric  Inner  Products 

In  this  section  we  show  how  to  compute  antisymmetric  inner  products  when  one  of  the 
component  functions  is  replaced  by  a  delta  function.  For  concreteness,  we  will  replace 

0i(7i)  by  -  7i)- 

IV.  1  Lowdin’s  Rule  with  <5( 7  —  71)  Present 

The  matrix  L  from  (36)  is  defined  by  L(i,j )  =  (0j,  (j>j).  If  we  replace  <^1(71)  by  5(y  —  71), 
then  the  first  row  depends  on  7  and  is  given  by  L(l,j)  =  (6(7  —  •),  (f>j)  =  (j>j{ 7).  We  thus 
have  a  matrix  that  depends  on  7, 

0  1(7)  02(7)  •••  <M  7) 

(02, 0l)  (0  2,  $2)  ■■■  (02,0iv) 

L(7)  =  •  .  .  •  (55) 

(0iV,0l)  2)  •••  (0iV,0iv) 

To  compute  with  £,(7)  without  resorting  to  cofactor  expansions,  we  express  £,(7)  as  a  rank- 
one  perturbation  of  a  matrix  of  numbers.  Define 

‘  d(  1)  d(  2)  •  •  •  d(N)  ' 

(02, 0l)  (02,02)  •••  (02 ,0jv)  / 

E  =  .  .  ,  (56) 

.  (0jv,0  1)  (0jv,0 2)  •••  (0AT,0Af)  . 

where  the  vector  d*  is  chosen  to  be  a  unit  vector  orthogonal  to  the  remaining  rows  of  E. 
This  choice  assures  that  the  rank  deficiency  of  E  will  be  smaller  than  or  equal  to  the  rank 
deficiency  of  the  matrix  with  any  other  first  row.  It  also  gives  us  some  convenient  properties, 
namely  Ed  =  ei,  d*E*  =  e*,  E*ei  =  d,  and  e)E  =  d*,  where  E*  is  the  modified  pseudo¬ 
inverse  of  Definition  4.  It  costs  0(N2M)  to  construct  E  and  0(N3)  to  compute  E+  and 
|E|. 

We  then  have 

L(y)  =  E  +  ei($(y)  -  d)*  (57) 

and,  with  the  help  of  Proposition  2,  compute 

|L(7)|  =  |E||X  +  d($(7)  -  d)*|  =  |E|  (1  +  ($(7)  -  d)*d)  =  |E|  *(7)*d ,  (58) 


which  yields 
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Proposition  13 


/  N  N  \ 

U7-7i)nto,nM  =|E|$(7),,d,  (59) 

\  i= 2  i=l  /  _4 

where  E  and  d  are  defined  as  above. 

Remark  14  If  i  >  1  t/ien 

(|E|  $*d,  &)  =  |E|<$,  &)*d  =  |E|£?(t,  -)*d  =  0 ,  (60) 

since  d  is  orthogonal  to  which  is  row  number  i  of  E.  Thus  the  function  (59)  is 

orthogonal  to  <fi  for  i  >  1.  The  same  propeHy  will  hold  when  the  operators  T,  V,  and  W 
are  present  in  the  antisymmetric  inner  product,  as  described  in  the  following  sections. 


IV. 2  Antisymmetric  Inner  Product  with  £(7  —  71)  and  (T  and/or 
V)  Present 

To  compute  antisymmetric  inner  products  involving  operators,  we  will  modify  formulas  from 
Section  III.  The  first  (trivial)  modification  is  to  denote  the  variable  of  integration  in  those 
formulas  by  7',  so  as  not  to  confuse  it  with  the  variable  7  in  S(j  —  71).  Next  we  replace  |L| 
with  |L(7)|  given  by  (58).  Using  (57),  we  can  express 


L^r1  =  (E  +  ed$(7)  -  d)*)-1  =  (E  (I  +  d($(7)  -  d)*)))-1 

=  (I  +  d(4>(7)  —  d)*)-1  E-1 .  (61) 


Using  the  Sherman-Morrisson  Formula  (see  e.g.  [21]  and  (B5)  in  Appendix  B)  we  then  have 


l(7)-1 


d(«fr(7)-d)*  \ 
l  +  ($(7)-d)*d  / 


E"1 


(d-$(7))*\ 
$(y)*d  ) 


E"1 . 


(62) 


The  vector  of  functions  0,  which  was  defined  by  L”1^?,  now  depends  on  the  variable  7 
in  5(7  —  71)  as  well  as  its  own  internal  variable  7'.  Replacing  L-1  with  (62)  and  with 
4>(7')  +  ei(5(7  —  7')  —  ^4  (7')),  we  obtain 

0(7, 7')  =  )  E_1  (^('7^  +  ei^(7  ~  V)  -  ^(V)))  ■  (63) 

To  compute  it,  we  first  compute  the  base  case  ©(7')  =  E_14>(7').  Multiplying  out  (63)  and 
noting  d*0  =  d*E*#  =  <^>1,  we  obtain 


0(7,7') 


0(7')  +  d 


d*0(7') 


^(7) *0(7')  +  5(7  -  7')  -  ^1(7') 

$(7)*d 

A/./i  ,^(7)*©(7')  -  5(7-7') 

=  0(7)~d - ¥(^d - 


(64) 


We  are  now  ready  to  state  our  main  formulas. 
Proposition  15  When  E  is  nonsingular, 


N 


N 


5(7-71)  Ipi(7i)>Cr  +  v)II  Mli) 


(65) 
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is  equal  to 


*(7)*  (d  J  (T*  +  V*)  [$]*0d7'  -  /  (T*  +  V*)  [**d]0d7^ 

+  (T*  +  V*)  [<fr*d](7)]  ,  (66) 

which  can  be  computed  with  total  cost  0{N 3  +  N2M). 

Proof:  To  compute  (65),  we  start  with  f  {%  +  V*)  [$]*0c?7'  from  (52)  and  substitute 
in  (58)  and  (64)  to  obtain 


m 

N\ 


|E|$(7)*d 

W\ 


’w  MIH*!/  $(7)*0(7,)-<5(7-7/)  )  J  ,  ,c^ 

(T,+V*)[$](7)  0(7)-d - - Jd7‘ 


Distributing  out  and  rearranging,  we  have 


§|  [  ^(7)*d(T*  +  V*)  W*(7')©(7')  -  {%  +  V*)  [$](7,)*d$(7)*0(7O 


+  (T*  +  V*)  [^](7,)*d5(7  -  7')d7' ,  (68) 


which  yields  (66).  Although  in  (62)  and  (64)  we  divide  by  3>*d,  which  could  be  zero,  this 
denominator  cancels  in  the  final  expression,  so  we  can  argue  by  continuity  that  the  final 
expression  is  still  valid.  One  can  also  prove  this  directly  by  determining  the  nullspace  of  L 
and  then  using  (54).  □ 


Remark  16  It  is  the  term  with  pointwise  multiplication,  (T*  +  V*)  [3>*d]  in  (66),  that  al¬ 
lows  adaptive  refinement  around  the  nuclei  in  the  numerical  algorithm. 

To  obtain  the  formulas  when  E  is  singular,  we  follow  the  same  logic  as  in  Section  III. 6.1. 
Denote  the  singular  vectors  in  the  nullspace  of  E  by  {(u,;,  v-,)}. 

Proposition  17  When  E  has  rank  deficiency  greater  than  one,  (65)  is  zero.  When  E  has 
rank  deficiency  one,  (65)  is  equal  to 

^^$(7)*  (d  J  (%  +  v.)  [***!]«;*<*/  -Vij  {%  +  v*)  [<r  d]ut4>d7')  ,  (69) 

which  can  be  computed  with  total  cost  0(N 3  +  N2M). 

IV. 3  Antisymmetric  Inner  Product  with  5(7  —  71)  and  W  Present 

Conceptually  the  derivation  if  W  is  present  in  the  inner  product  is  the  same  and  we  obtain 
the  following  propositions. 

Proposition  18  When  E  is  nonsingular, 

/  n  n  \ 

(  d(7  -  7l)  Mji),  W  (fo(7i)  \ 

\  i=2  i= 1  / 


(70) 
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is  equal  to 
1  |E 


2  M  [2  (*(7)*dw„  [**  ©]  (7)  -  *{1TW 


©$*d 


$(7)*  (  d  j  $*©w. 

-2  /  ew. 


$*© 


-  #*yw 


©$’ 


@dy 


$*© 


$*d-  ©$*VW 


©$*d 


dry' 


which  can  be  computed  with  total  cost  0(N 3  +  N2M\ogM). 
Proposition  19  When  E  has  rank  deficiency  one,  (70)  is  equal  to 

1 


n[(*W 


dW, 


|Et|lV 

4*(7)‘ (<•/*■*(*! 

+  J  Q^*v{Wv 


(7)  -  $(7)*viW1= 


u*$3>*d 


$*© 


u^$$*d 


-  Vi  /  $’d  u;$W, 


$*© 


-wv 

-wv 

-wT 


u  l<f>$ 

$*ViU 


9)  d7' 

$*d)  d7' 


0)  di 


which  can  be  computed  with  total  cost  0(N 3  +  N2M  +  NMlogM). 
Proposition  20  When  E  has  rank  deficiency  two,  (70)  is  equal  to 

1 


|Et|JV! 


$(7)*  d  /  ^viu^Wp  $*v2u^ 


-  **v2W- 


—  Vi  J  «t>*V2U2^W f 

-  v2  J 

which  can  be  computed  with  total  cost  0(N3  +  NM  +  MlogM). 


$*du^ 

-  **v2W„ 

u^$*d 

$*du?j3> 

-  ^*v1Wv 

u^dy 

u  l&dy 


(71) 


(72) 


(73) 


V  Details  of  the  Green’s  Function  Iteration 

In  this  section  we  fill  in  the  missing  pieces  in  the  Green’s  function  iteration  algorithm 
outlined  in  Section  II. 2.  First  we  give  a  representation  for  the  Green’s  function  itself.  Then 
we  use  the  methods  in  the  previous  sections  to  construct  the  vector  b  in  (18)  and  the  matrix 
A  in  (17)  to  form  the  normal  equations  (15).  Next  we  give  the  algorithm  from  Section  II. 2  in 
outline  form  as  pseudocode.  Finally  we  gather  the  computational  cost  of  the  whole  method, 
and  present  some  linear  algebra  techniques  to  reduce  it. 

V.l  Representing  the  Green’s  Function 

In  this  section  we  construct  a  separated  representation  for  the  Green’s  function  in  (7), 
following  the  ideas  in  [4,  5]  (see  also  [22,  23]).  We  will  use  this  representation  in  Section  V.2 
when  constructing  the  right-hand-side  of  the  normal  equations. 
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We  begin  by  constructing  an  approximation  of  1  /t  with  exponentials  such  that 


1 

t 


L 

wP  exp (-Tpi) 

p=  i 


<  e, 


(74) 


on  the  interval  t  G  [1,  oo),  with  uip  and  rp  positive.  Expansions  of  1  /t  into  exponentials  have 
been  used  in  several  applications  and  constructed  by  diverse  techniques;  see  [8,  29,  59,  6,  9, 
24]  and  the  references  therein.  The  interval  [1,  oo)  is  addressed  specifically  in  [9],  where  it  is 
shown  that  the  error  rate  e  =  0(exp(—cy/L))  can  be  achieved,  which  means  we  can  achieve 
L  =  0{{  lne)2). 

Substituting  t  =  s/(— /z)  for  p  <  0  into  (74)  and  dividing  by  —  p,  one  has 


1 

s 


p—1 


(75) 


valid  on  the  interval  s  £  [—p,  oo).  In  Fourier  coordinates,  we  can  express 

c  1 

from  which  we  see  that  || QA\  =  1  /(— /z).  Since  the  denominator  is  at  least 
substitute  into  (75)  and  obtain 


(76) 


— /z  >  0,  we  can 


p„-t. 


ECU 

-n 

p=i 


N  o  2 

0»p(~g) 

i=l  A 


< 


=  4Gp 


(77) 


Thus  we  obtain  an  approximation  of  Gp  with  relative  error  e  in  norm  using  L  terms,  with 
L  independent  of  N  and  p.  To  construct  Gp  as  an  integral  operator  in  spatial  coordinates, 
we  apply  the  inverse  Fourier  transform  to  obtain 

L  N 

(78) 

p—l i=l 

where  the  convolution  operator  ,  which  depends  implicitly  on  /z,  is  defined  by 


f  (li  ?.•••)  In  )  = 


-/zeT 


l/N 


lL 


2ttt, 


exP  1  -77T llr*  -r 

ZTp 


P 
/II 2 


3/2 


/(71, . . . ,  7i_i,  (r',  Vi),  7i+i, . . . ,  7jv)dr' .  (79) 


This  construction  has  theoretical  value,  since  it  has  proved  the  following  theorem. 


Theorem  21  For  any  e  >  0,  /z  <  0,  and  N,  the  N -particle  Green’s  function  Qf,  has  a  sepa¬ 
rated  representation  with  relative  error  in  operator  norm  bounded  by  e  using  L  =  0(( lne)2) 
terms,  with  L  independent  of  p  and  N . 


V.2  Constructing  the  Right-Hand-Side  Vector  b  in  (18) 

In  order  to  do  a  step  in  the  iteration,  we  need  to  construct  the  right-hand-side  b  in  the 
normal  equations  (15)  in  Section  II. 2. 2.  Since  A  is  an  orthogonal  projection,  A  and  Gf, 
commute,  and  Gf,  is  self-adjoint,  the  entry  (18)  is  equal  to 

r  /  N  N  \ 

b(l)h)  =  -Si  ^2sm  (AGpSi'j  -  71)  n [V  +  W]  )  •  (80) 

m  \  i—2  i=  1  / 
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Substituting  (78)  in  for  and  rearranging,  we  have 


N 


N 


HO (7)  =  -si  ^2 s™  ( -4jr?/(7  -  71)  II ^$(7*),  [v  +  w]  JJ  C(t i) 


p= 1 


i=2 


i=l 


(81) 


The  computation  is  of  the  same  form  for  each  value  of  the  indices  l,  in ,  and  p,  so  we  can 
consider  a  single  term  and  suppress  the  indices. 

To  evaluate  a  single  term  (AFriS( 7  -  71)  11=2  [V  +  W]  ni=i  </>i(li))  we  use 

the  formulas  in  Propositions  15-20  in  Sections  IV. 2  and  IV. 3,  with  two  modifications.  The 
first  modification  is  that  $  is  replaced  with  throughout.  This  replacement  causes  no 
structural  change  to  the  formulas;  it  just  changes  the  inputs.  The  second  modification  is 
caused  by  the  replacement  of  <5(7  —  71)  by  J-ri5{ 7  —  71)-  The  first  row  of  1,(7)  in  (55) 
becomes  ^"$(7)*,  which  makes  |L(7)|  =  |E|JF$(y)*d.  Similarly,  (64)  becomes 


0(7,7') 


a,  >(V) -^(7 -7') 

0(7 }  ^  d - - 


(82) 


Tracking  T  through  the  formulas,  we  find  that  all  we  need  to  do  is  to  modify  the  formulas 
in  Sections  IV. 2  and  IV. 3  by  applying  T  to  the  final  result. 


V.3  Constructing  the  Matrix  A  in  (17) 

In  this  section  we  construct  the  kernels  in  (17)  for  the  normal  equations  (15),  using  the  same 
ideas  as  in  Section  IV.  We  fix  l  and  l1  and  define 


K{  7,V)  = 

W(Y)  = 
y(j)  = 

D  = 

Using  Lowdin’s  rules  (37)  we  have 


A(U')(7,Y) 

SlSi / 

[  ^(Y)  •••  $v(y)  r 

[  ^2(7)  •••  4>n{ 7)  ]* 

(Y2A2)  (4>^4>n) 

.  <&,&>  ■"  (0JVi0J\r)  - 


K(l,Y) 


w 

N\ 


1 

m 


<5(7-7')  y*(7) 

w(y')  D 


Expressing  L  as  a  low-rank  perturbation  of 


1  0 
0  ID 


we  have 


(83) 

(84) 

(85) 

(86) 


(87) 


K(  7,7') 


1 

m 

'  1  0 

0  D 

+ 

'  1  ' 
0 

[  0  y*(7)  ]  + 

1 

1  0 

1  + 

'  1  ' 

[  0  y*(7)  ]  + 

m 

0  D 

0 

5(7-7')  -  1 

w(7') 

5(7  -  7')  -  1 

D-1w(7/) 


[1  0] 
[1  0] 


m  1  y*(7)D"1w(7') 

N\  1  5(7-7') 


(5(7  -  7') 


y*(7)D  ^(7'))  . 


(88) 
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If  D  is  singular  then  we  apply  the  same  logic  as  in  Section  III. 6.1.  If  ID  has  rank-deficiency 
greater  than  one  then  K( 7,7')  =  0.  If  it  has  rank-deficiency  one  then  we  have  K( 7,7')  = 


|B*|IV! 


[  0  y*  (7)  ] 


1  0  0  v*0^w(7/) 

-y*(7)v  1  y*(7)Dtw(V)  = 

|U  |iV-  0  1  (5(7-7') 


<5(7  -  70  -  1  1  r  1  0  1 
B*w(7')  J  L  u  J 


-(y*(7)v)(v*tfw(7Q) 

\Bt\N\ 

-(y*(7)v)(u*w(70) 

\m\N\ 


,  (89) 


where  B*  is  the  modified  pseudo-inverse  of  Definition  4. 

In  the  nonsingular  case,  we  can  construct  B  at  cost  0{N2M)  and  compute  B_1  at  cost 
0(N3).  Applying  this  kernel  costs  O(NM)  to  integrate  against  a  function  in  7',  0(N2) 
to  apply  B_1,  and  then  O(NM)  to  apply  y*  to  the  result.  In  the  singular  case,  we  can 
compute  B*  at  cost  0(N3)  and  construct  y*v  and  u*w  at  cost  0{NM).  Since  the  variables 
separate,  applying  this  kernel  costs  O(M). 


Remark  22  In  the  case  r  =  1,  which  corresponds  to  the  Hartree-Fock  formulation,  B  =  I 
and  Kfjj'y')  is  just  the  projector  orthogonal  to  {4>%}f-2- 


V.4  Pseudocode 

In  this  section  we  give  the  algorithm  in  outline  form  as  pseudocode.  We  do  not  indicate 
when  objects  can  be  recalled  or  updated  from  previous  computations. 

Loop  through  I  Green’s  function  iterations  (9,10,13).  For  each  of  these: 

Construct  QIL  as  in  Section  V.l,  obtaining  the  operators  Tv  in  (79). 

Loop  through  the  N  directions  (electrons).  For  each  of  these: 

Compute  A(l,l')  via  (88)  for  all  ( l, l '). 

Compute  6(i)(7)  in  (81)  by: 

Loop  in  the  r  values  of  l  and  for  each: 

Sum  over  the  L  values  of  p  and  for  each: 

Compute  tFp(j)li  for  all  i. 

Sum  over  the  r  values  of  m  and  for  each: 

Using  in  place  of  4>,  construct  E  in  (56). 

Compute  |E|  and  E_1. 

Construct  ©  =  E_1iF$. 

Construct  4>*0,  <l>*d.  and  0$*. 

Compute  Wv  [$*©]  and  Wv  [©$*  . 

Compute  (66)  and  (71)  using  these  ingredients. 

Apply  IFP  to  ((66)  +  (71)). 

Apply  conjugate  gradient  to  solve  the  normal  equations  (15). 

Renormalize  as  in  (10). 

Update  fi  via  (13). 

Remark  23  We  have  presented  the  algorithm  in  serial  form  for  clarity.  The  loop  in  l, 
sum  in  p,  and  sum  in  m  can  be  trivially  parallelized.  Parallelizing  the  loop  through  the  N 
electrons  would  represent  a  change  in  the  algorithm,  which  we  will  develop  elsewhere. 
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V.5  Overall  Computational  Cost 

The  computational  cost  is  dominated  by  the  repeated  construction  and  solution  of  the 
normal  equations  (15).  For  a  fixed  direction,  the  construction  cost  is  dominated  by  (81), 
which  has  r2L  inner  products.  The  most  costly  portion  of  the  inner  products  is  (71),  which 
requires  0(N3  +  N2 M  log  M)  operations,  giving  us  the  net  construction  cost 

0(r2 LN2(N  +  MlogM)) .  (90) 

The  operation  count  to  solve  the  normal  equations  (15)  by  applying  the  matrix  of  integral 
operators  A  S  times  is 

0(r2SN(N  +  M)) .  (91) 

As  we  loop  through  the  directions,  we  may  reuse  several  quantities,  so  the  total  cost  of 
the  construction  is  less  than  N  times  the  cost  for  one  direction.  In  fact,  the  construction  cost 
for  the  entire  loop  through  N  directions  is  of  the  same  order  as  the  cost  for  one  direction. 
The  application  cost  is  simply  multiplied  by  N .  In  the  sections  below  we  show  how  to 
update  the  construction  for  direction  k  =  2  using  what  we  already  have  for  direction  k  =  1, 
and  then  determine  the  cost  for  one  loop  through  the  directions.  We  defer  the  development 
of  the  technical  linear  algebra  rules  on  low-rank  updates  to  Appendix  B,  and  here  only  show 
how  to  apply  them  to  our  problem.  Our  final  conclusion  is  the  computational  cost 

0(Ir2N2[L{N  +  MlogM)  +  S{N  +  M)]),  (92) 

where  /  the  number  of  Green’s  function  iterations. 


V.5.1  Reuse  in  Computing  A 

Let  Oi  denote  D  in  (86)  for  directions  one,  and  O2  the  version  for  direction  two.  We  let  <f>\ 
denote  the  updated  version  of  4>\ .  To  construct  O2  requires  only  the  first  column  and  row 
of  ID>i  to  be  updated,  specifically 


©2  =  ©i  +  ei  [  0  {{4>{,4>l)  - 


{4>[,4>i)  -  (02>0 


(93) 


Computing  those  inner  products  involving  <j>\  and  (j>\  costs  O(NM).  Using  Proposition  24 
twice,  we  compute  ©|,  |D||,  and  if  appropriate  v,  all  at  cost  0(N2).  The  formulas  (87)  and 
following  are  modified  by  inserting  the  extra  column  and  row  in  the  second  place  instead  of 
the  first,  but  otherwise  the  procedure  is  unchanged.  The  cost  for  one  loop  through  the  N 
directions  is  thus  0(N3  +  N2M). 


V.5. 2  Reuse  in  Computing  Antisymmetric  Inner  Products  with  <$(7  —  71)  and 
Operators 

We  again  let  <j>\  denote  the  updated  version  of  <j>[  computed  during  the  k  =  1  solve.  The 
inner  products  needed  to  construct  E2  require  only  the  one  row  involving  <f>  1  to  be  updated, 
at  cost  O(NM).  The  vector  di  can  be  constructed  by  doing  the  SVD  of  Ei  with  the  first 
row  set  to  zero  and  then  selecting  one  of  the  right  singular  vectors  v.(  with  zero  singular 
value.  Using  Proposition  24  we  obtain  the  SVD  of  E2  with  first  row  set  to  zero  and  second 
row  containing  the  new  inner  products,  and  thus  can  find  d2.  Putting  the  first  and  second 
rows  back  in  proper  position,  we  then  have 

E2  =  Ei  +  ei  ([  •••  {T(j>i,(j>N)  ]  ~  d*) 

+  e2(d^-  [  (TfaAi)  •••  <^2,<M]),  (94) 
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and  we  can  compute  |E||  and  E|  using  Proposition  24  twice,  at  cost  0(N2). 

Proposition  24  produces  a  rank  two  update  and  we  must  apply  it  twice.  For  notational 
ease  we  will  show  how  to  use  a  rank  one  update  applied  once;  the  method  easily  extends. 
Assuming  E|  =  Ej  +  fg* ,  we  next  update 


02  =  e|.f42  =  (Ef  +  fg*)(^$i  +  ei(&  -  4>i)) 

=  ©1  +  di((£i  -  4>x)  +  fg*.F$i  +  fg*ei((^i  -  0i)  (95) 


at  cost  O(NM).  It  is  insufficient  to  just  update  ©2  in  this  way,  since  it  would  still  cost 


0(N2M\ogM)  to  compute  Wv 


02$* 


in  (71).  Instead  we  update  the  combined  quantity 


= 


©!$* 


$*diW> 


$*fg*ei>Vp 


{(p i  -  <£i)3>  (96) 


at  cost  0(NM  log  M) .  With  this  quantity  and  02  we  can  compute  (71)  at  cost  0(NM  log  M) . 
The  singular  cases  work  similarly.  The  cost  for  one  loop  through  the  N  directions  is  thus 
0(N2M  log  M). 
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A  Appendix:  Algorithms  Based  on  Gradient  Descent 

We  prefer  the  integral  iteration  in  Section  II. 2.1  due  to  the  generally  superior  numerical 
properties  of  integral  formulations.  One  could,  however,  try  to  minimize  (4)  directly  with  a 
method  based  on  gradients.  Since  the  machinery  that  we  have  constructed  applies  to  these 
methods  as  well,  we  sketch  how  it  can  be  used. 

To  minimize  (4)  we  could  use  a  gradient  descent,  starting  at  some  initial  guess  for  ip. 
Inserting  our  current  approximation  xp  and  formally  taking  the  gradient,  we  have 

2  w.  ^)a  -  v^).4  cA1') 

(V’A )/ 

Defining  fi  to  be  our  current  value  of  (4),  the  gradient  reduces  to 

^)A  ( ~  vv^)  ■  (A2) 

The  gradient  is  with  respect  to  the  parameters  that  are  used  to  minimize  (4).  In  our  case 
that  is  the  values  of  the  functions  <pj.  Taking  the  gradient  with  respect  to  the  point  values 


an  Unconstrained  Sum  of  Slater  Determinants 


25 


of  4>lj  results  in  a  vector  g  of  functions,  defined  by 

2  r  /  jv  n  \ 

4W  =  TT~h)~' Sl  )  .  (A3) 

\*C5  r/,4  m— 1  \  i=l  /  ^ 

where  5(7  —  7j)  is  the  delta  function.  The  methods  in  Section  IV  can  be  used  to  construct 
g- 

Moving  t  in  the  direction  opposite  the  gradient  replaces  ip  with 

r  N 

sz  n  (A4) 

1=1  i= 1 

Some  search  procedure  can  then  be  used  to  find  an  appropriate  t.  Then  if)  is  updated  and 
the  procedure  repeated. 

Alternatively,  we  can  use  an  alternating  direction  approach  and  take  the  gradient  with 
respect  to  the  functions  <p\  for  one  direction  i,  while  fixing  the  functions  in  the  other  di¬ 
rections,  and  then  loop  through  the  directions.  This  loop  through  the  directions  is  then 
repeated  /  times  until  we  obtain  the  desired  accuracy.  We  describe  the  i  =  1  case.  Moving 
t  in  the  direction  opposite  the  gradient  replaces  ip  with 

r  N  r  N 

si{4>[  -tg[)\\_(p\  =  ip-t^2sig[\\_<pli  =  ip -tip. 

1=1  i= 2  1=1  i= 2 

Inserting  (A5)  into  (4)  results  in 

(H(ip  —  tip),  ip  —  tip^  (Hip,ip)A  —  2t(TLip,^  +t2(nip,'ipj 

-t^, ip-  t4>y a  {ip,  ip)A  -  2  t(ip,  ip^ + t2(ip,  ipy ^ 

Once  the  inner  products  have  been  computed,  we  can  find  the  minimizer  for  (A6)  by  solving 
a  quadratic  equation,  and  then  update  ip  via  (A5).  The  cost  to  construct  g  for  one  direction 
is  r2  times  the  cost  for  one  inner  product.  The  dominant  cost  for  the  inner  product  comes 
from  (71),  which  costs  0(N3  +  N2M\ogM),  giving  us  the  net  construction  cost 

0{r2N2(N  +  MlogM)) .  (A7) 

As  described  in  Section  V.5.2,  many  of  the  computations  can  be  reused,  so  the  cost  for  a 
single  loop  through  the  N  directions  is  of  the  same  order.  Thus,  for  I  loops  through  the 
directions  the  overall  computational  cost  is 

0(Ir2N2(N  +  MlogM)).  (A8) 


(A5) 


(A6) 


B  Appendix:  Low-rank  Updates 


In  this  section  we  develop  formulas  for  low-rank  updates  to  At,  A-1  and  | A4 1 ,  based  on 
[42,  3], 


Proposition  24  Given  A,  At,  A-1,  | A4 1 ,  b, 


and  c,  let  Ai 


+  be*  and  compute 


d  =  Atb, 
d  =  d*d, 

A  =  l_+c*Atb, 
p  =  Ad  +  dg, 


e  =  (At)*c, 
e  =  e*e, 

/i  =  |A|2  +dg , 
q  =  Ae  +  ef. 


f  =  (I- AAt)b, 
/  =  f*f, 

v  =  |A| 2  +e/, 


g  =  (I  - 


(Bl) 


g  =  g  g, 
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1.  If  A  =  0,  /  =  0,  and  g  =  0,  then  rank( Ai)  =  rank( A)  —  1  and 

—  d_1dd*A^  +  e_1(— A^e  +  (A1(d*A^e)d)e*  (B2) 

A^  =  A^  +  (l/v/de)de*  (B3) 

|Aj|  =  -(l/V*)|At|.  (B4) 

2.  If  A  =£  0,  /  =  0,  and  g  —  0,  then  rank( Ai)  =  rank( A)  and 

A\  =  Af  -  A_1de*  (B5) 

Aj1  =  A1*  (B6) 

|Af  |  =  lA^IA”1 .  (B7) 


3.  If  f  =  0  and  g  ^  0,  then  rank( Ai)  =  rank(A)  and 


A|  =  A '  — /i  id(<gd*AT  +  Ae*)  +  fi  g(— de*  +  Ad 
AX  =  AX  _  jAK^-IAIIg  +  Agd 

/.  If  f  7^  0  and  g  =  0;  t^en  rank(Ai)  =  rank(A)  and 

Aj  =  A^  —  zv_1(/A^e  +  Ad)e*  +  n_1(— ed  +  AA^e)f* 


tf  =  Ax  -  A^f 


(IAIO  -  |A|)f  +  A/e)* 


/|A|v^ 


4l  = 


(A  -  A) |  A|2  +  \v 


v\\\s/v 

5.  If  f  7^  0  and  J  /  0,  t/ien  rank( Ai)  =  rank(A)  +  1  and 

At  =  A*  -  r  xdf*  +  ff-'gC-e*  +  A/_1f*) 

A^  =  Ax  -  (1/Vfl?)gf* 

I  All  =  |A*|  [l  +  (s"1/"1  -  (1/ V5/))g*A“Lf' 


(B8) 

(B9) 

(BIO) 


(BH) 

(B12) 

(B13) 


(B14) 

(B15) 

(B16) 


The  cost  to  compute  Aj,  Af,  and  | A| |  is  0(N2). 

Proof:  The  overall  method,  update  rules  for  rank( Ai),  and  update  rules  for  A)  are  taken 
from  [3],  who  also  list  the  useful  properties 

c*d  =  e*b  =  A  —  1,  b*f  =  /,  c*g  =  o,  d*g  =  0,  e*f  =  0, 

.  .  .  (B17) 

At  Ad  =  d,  AAte  =  e,  A*f  =  Atf  =  0,  Ag  =  (A*)*g  =  0. 

They  give  update  rules  for  the  row  and  column  spans  of  Ai ,  which  we  translate  into  update 
rules  for  A^.  The  cases  (B3),  (B6),  and  (B15)  follow  directly.  Corresponding  to  (B9),  their 
update  rule  is  that  the  row  span  of  A-  should  be  extended  (orthogonally)  by  d  and  then 
reduced  by  projecting  orthogonal  to  p.  We  translate  this  into  a  (Householder)  reflection 
of  the  vector  g  into  a  vector  in  the  span  of  d  and  g  perpendicular  to  p.  Adjusting  these 
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vectors  to  have  equal  norm  and  real  inner  product  yields  the  reflection  of  the  vector 
to  —  |A|(gd  —  Ag),  resulting  in 


I- 


2(A^Zg+  |A|(ffd  -  Ag))(A^g  +  |A|(gd  -  Ag))* 


(B18) 


tKAy^gd-  |A|(ffd-  Ag))||2 

which  simplifies  to  (B9).  To  obtain  (B12)  we  use  the  same  process,  extending  the  column 
span  by  e  and  then  projecting  orthogonal  to  q  by  a  reflection  of  Xy/vf  to  — |A|(/e  —  Af). 

To  derive  the  update  rules  for  |A*|,  first  add  the  update  rules  for  and  A ^  and  then 
take  the  determinant.  On  the  right  hand  side  factor  out  a  copy  of  A*  leaving  a  low-rank 
perturbation  of  the  identity,  to  which  we  can  apply  Proposition  2.  To  simplify  the  results, 
we  use  (Bl),  (B17),  and  the  further  observations 

(Al)-1d  =  b  —  f,  (Al)-1Afe  =  c  —  g,  (A^)_1g  =  (AJ-)*c, 
e*(Al)-1  =  c*  -  g*,  f*(Al)_1  =  b*(AJ-)*. 

To  obtain  (B4)  we  compute 

|l  —  d_1bd*A^  +  ((l/\/de)b  —  e_1e  +  <i~1e~1(d*A^e)b)e, 


(B19) 


1  -  d~1d*Atb  d*At((l/Vde)b  -  e^e  +  d*Ate)b) 

1  +  e*((l/v/de)b  —  e_1e  +  d_1e_1(d*A'l'e)b) 


— d_1e*b 
0 


d*At(l/>/tfe)b 


=  |A*|(— (1  /Vde)).  (B20) 


For  (B7)  we  have  | A|  |  = 
we  compute 


e*((l/v/de)b  +  d  1e  1(d*Ate)b) 

[  -  A-1be*|  =  | A* |(1  -  A_1e*b)  =  lA^A’1 .  To  obtain  (BIO) 


d(— jj  (pd*AT  +  Ae*)  — 


Ag*A^ , 


\My/d 

+g +  Ad*At)  -  (v^~  lAljg'A±)A 

9yfd  ) 

1  +  {-gTx{gd*^  +  Ae*))b  (//  +  Ad*At))*b 

(— Ag*AJ-/|A|y//I)(AJ-)*c  1  -  ((y/M  -  IAI)g*A±/gy/fl*(A±)*c 

(A  -  A)|A|2  +  Xfj, 

M|A|  y/U 


X/fj,  d/ g 
-Xg/\X\JJ1  |A|/VM 


(B21) 


A  similar  calculation  yields  (B13).  To  obtain  (B16)  we  compute 

I  +  (A*)-1  (-/"Mr  +  g(<7_1(— e*  +  A/-1f*)  -  (1/ y/tf)V)) 

1  f*(A±)*c 

g-\f-\X  -  1)  1  +  (g-'Xf-1  -  (l/y/gj))f*  (A±)*c 

=  |At|(l  +  ((-(l/v^))  +g-1r1)f*(A±yc) 

"l  +  GT1/" 1  -  (l/v/5/))g*A"Lf 


(B22) 

□ 


When  A  and  Ai  are  nonsingular,  (B5)  is  the  Sherman-Morrisson  Formula  (see  e.g.  [21]). 
For  our  application  we  need  the  singular  vectors  in  A-1-,  rather  than  A1-  itself,  but  then  only 
when  rank(A± )  <  3.  These  singular  vectors  can  be  extracted  by  a  simple  modification  of 
the  power  method  with  deflation. 


an  Unconstrained  Sum  of  Slater  Determinants 


28 


References 

[1]  H.  Agren,  A.  Flores- Riveros,  and  H.J.Aa.  Jensen.  Evaluation  of  first-  and  second-order 
nonadiabatic  coupling  elements  from  large  multiconfigurational  self-consistent-field  wave 
functions.  Physical  Review  A,  34(6):4606-4614,  December  1986. 

[2]  Philippe  Y.  Ayala  and  H.  Bernhard  Schlegel.  A  nonorthogonal  Cl  treatment  of  symmetry 
breaking  in  sigma  formyloxyl  radical.  J.  Chem.  Phys.,  108(18):7560-7567,  May  1998. 

[3]  Jerzy  K.  Baksalary,  Oskar  Maria  Baksalary,  and  Gtz  Trenkler.  A  revisitation  of  fomulae  for 
the  Moore-Penrose  inverse  of  modified  matrices.  Linear  Algebra  Appl,  372:207-224,  2003. 

[4]  G.  Beylkin  and  M.  J.  Mohlenkamp.  Numerical  operator  calculus  in  higher  dimensions.  Proc. 
Natl.  Acad.  Sci.  USA,  99(16):10246-10251,  August  2002. 

http : //www.pnas . org/ cgi/ content /abstract /112329799vl. 

[5]  G.  Beylkin  and  M.  J.  Mohlenkamp.  Algorithms  for  numerical  analysis  in  high  dimensions. 
SIAM  J.  Sci.  Comput.,  26(6):2133-2159,  July  2005. 

http : //amath. Colorado . edu/pub/wavelets/papers/BEY-M0H2005 .pdf. 

[6]  G.  Beylkin  and  L.  Monzon.  On  approximation  of  functions  by  exponential  sums.  Appl. 
Comput.  Harmon.  Anal.,  19(l):17-48,  2005. 

http : //amath. Colorado . edu/pub/wavelets/papers/af es .pdf. 

[7]  Gregory  Beylkin,  Vani  Cheruvu,  and  Fernando  Perez.  Fast  adaptive  algorithms  in  the 
non-standard  form  for  multidimensional  problems.  J.  Comp.  Phys.,  2006.  Submitted.  APPM 
Preprint  #550. 

[8]  D.  Braess.  Asymptotics  for  the  approximation  of  wave  functions  by  exponential  sums.  J. 
Approx.  Theory,  83(1):93-103,  1995. 

[9]  Dietrich  Braess  and  Wolfgang  Hackbusch.  Approximation  of  1/x  by  exponential  sums  in 
[1,  oo).  IMA  J.  Numer.  Anal,  25(4):685-697,  2005. 

[10]  Rasmus  Bro.  Parafac.  tutorial  &  applications.  In  Chemom.  Intel l.  Lab.  Syst.,  Special  Issue 
2nd  Internet  Conf.  in  Chemometrics  (INCINC’96),  volume  38,  pages  149-171,  1997. 
http : //www. models .kvl . dk/user s/rasmus/pre sent at ions /par af  ac_tutorial/paraf .htm. 

[11]  Eric  Cances,  Mireille  Defranceschi,  Werner  Kutzelnigg,  Claude  Lc  Bris,  and  Yvon  Maday. 
Computational  quantum  chemistry:  a  primer.  In  Handbook  of  Numerical  Analysis,  Vol.  X, 
pages  3-270.  North-Holland,  Amsterdam,  2003. 

[12]  Bin  Chen  and  James  R.  Anderson.  A  simplified  released-node  quantum  monte  carlo 
calculation  of  the  ground  state  of  LiH.  J.  Chem.  Phys.,  102(ll):4491-4494,  March  1995. 

[13]  E.U.  Condon  and  G.H.  Shortley.  The  Theory  of  Atomic  Spectra.  Cambridge  Uinversity  press, 
1967. 

[14]  Lieven  De  Lathauwer,  Bart  Dc  Moor,  and  Joos  Vandewalle.  On  the  best  rank-1  and 
rank-(J?i,  R2,  •  •  •  ,  Rn)  approximation  of  higher-order  tensors.  SIAM  J.  Matrix  Anal.  Appl, 
21(4):1324-1342,  2000. 

[15]  Fokke  Dijkstra  and  Joop  H.  Van  Lenthe.  On  the  rapid  evaluation  of  cofactors  in  the 
calculation  of  nonorthogonal  matrix  elements.  International  Journal  of  Quantum  Chemistry, 
67:77-83,  1998. 

[16]  Fokke  Dijkstra  and  Joop  H.  Van  Lenthe.  Gradients  in  valence  bond  theory.  Journal  of 
Chemical  Physics,  113(6):2100-2108,  August  2000. 

[17]  Frank  Ethridge  and  Leslie  Greengard.  A  new  fast-multipole  accelerated  Poisson  solver  in  two 
dimensions.  SIAM  J.  Sci.  Comput.,  23(3):741-760  (electronic),  2001. 

[18]  Reinhold  Fink  and  Volker  Staemmler.  A  multi-configuration  reference  CEPA  method  based 
on  pair  natural  orbitals.  Theoretica  Chimica  Acta,  87(1/2):129-145,  November  1993. 

[19]  P.  Gilbert.  The  reconstruction  of  a  three-dimensional  structure  from  projections  and  its 
applications  to  electron  microscopy  II.  Direct  methods.  Proc.  R.  Soc.  Lond.  B.,  pages 
89-102,  1972. 


an  Unconstrained  Sum  of  Slater  Determinants 


29 


[20]  T.  L.  Gilbert.  Multiconfiguration  self-consistent-field  theory  for  localized  orbitals.  I.  The 
orbital  equations.  Phys.  Rev.  A,  6(2),  August  1972. 

[21]  G.  Golub  and  C.  Van  Loan.  Matrix  Computations.  Johns  Hopkins  University  Press,  3rd 
edition,  1996. 

[22]  W.  Hackbusch  and  B.  N.  Khoromskij.  Low-rank  Kronecker-product  approximation  to 
multi-dimensional  nonlocal  operators.  I.  Separable  approximation  of  multi-variate  functions. 
Computing,  76(3-4):  177-202,  2006. 

[23]  W.  Hackbusch  and  B.  N.  Khoromskij.  Low-rank  Kronecker-product  approximation  to 
multi-dimensional  nonlocal  operators.  II.  HKT  representation  of  certain  operators. 

Computing,  76(3-4) :203-225,  2006. 

[24]  Wolfgang  Hackbusch.  Entwicklungen  nach  exponentialsummen.  Technical  Report  4, 
Max-Planck-Institut  fur  Mathematik  in  den  Naturwissenschaften,  Leipzig,  Germany,  2005. 
see  also  http://www.mis.mpg.de/scicomp/EXP_SUM/. 

[25]  R.J.  Harrison,  G.I.  Fann,  T.  Yanai,  and  G.  Beylkin.  Multiresolution  quantum  chemistry  in 
multiwavelet  bases.  In  P.M.A.  Sloot  et.  al.,  editor,  Lecture  Notes  in  Computer  Science. 
Computational  Science-ICCS  2003,  volume  2660,  pages  103-110.  Springer,  2003. 

[26]  R.J.  Harrison,  G.I.  Fann,  T.  Yanai,  Z.  Gan,  and  G.  Beylkin.  Multiresolution  quantum 
chemistry:  basic  theory  and  initial  applications.  J.  Chem.  Phys.,  121(23):11587-11598,  2004. 
http : //amath. Colorado . edu/pub/wavelets/papers/mrqc .pdf. 

[27]  Richard  A.  Harshman.  Foundations  of  the  parafac  procedure:  Model  and  conditions  for  an 
“explanatory”  multi-mode  factor  analysis.  Working  Papers  in  Phonetics  16,  UCLA,  1970. 
http :  //publish.uwo  .  ca/  ^harshman/wpppf  acO  .pdf. 

[28]  T.  Helgaker  and  P.R.  Taylor.  Modern  Electronic  Structure  Theory.  World  Scientific, 
Singapore,  1995. 

[29]  Tomasz  Hrycak  and  Vladimir  Rokhlin.  An  improved  fast  multipole  algorithm  for  potential 
fields.  SIAM  J.  Sci.  Comput.,  19(6):1804-1826  (electronic),  1998. 

[30]  Walter  Hunziker.  On  the  spectra  of  Schrodinger  multiparticle  Hamiltonians.  Helv.  Phys. 

Acta,  39:451-462,  1966. 

[31]  M.  H.  Kalos.  Monte  Carlo  calculations  of  the  ground  state  of  three-  and  four-body  nuclei. 
Phys.  Rev.  (2),  128:1791-1795,  1962. 

[32]  M.  H.  Kalos.  Monte  Carlo  integration  of  the  Schrodinger  equation.  Trans.  New  York  Acad. 
Sci.  (2),  26:497-504,  1963/1964. 

[33]  Tosio  Kato.  Fundamental  properties  of  Hamiltonian  operators  of  Schrodinger  type.  Trans. 
Amer.  Math.  Soc.,  70:195-211,  1951. 

[34]  W.  Klopper.  R12  methods,  Gaussian  geminals.  In  J.  Grotendorst  et.  al.,  editor,  Modern 
Methods  and  Algorithms  of  Quantum  Chemistry,  volume  1  of  NIC  Series,  pages  153-201. 
John  von  Neumann  Institute  for  Computing,  2000. 

[35]  Wim  Klopper  and  Claire  C.  M.  Samson.  Explicitly  correlated  second-order  Mpller  Plesset 
methods  with  auxiliary  basis  sets.  Journal  of  Chemical  Physics,  116(15),  April  2002. 

[36]  Pieter  M.  Kroonenberg  and  Jan  de  Leeuw.  Principal  component  analysis  of  three-mode  data 
by  means  of  alternating  least  squares  algorithms.  Psychometrika,  45(l):69-97,  1980. 

[37]  C.  Le  Bris,  editor.  Handbook  of  Numerical  Analysis.  Vol.  X.  North-Holland,  Amsterdam, 
2003.  Special  Volume:  Computational  Chemistry. 

[38]  S.  E.  Leurgans,  R.  A.  Moyeed,  and  B.  W.  Silverman.  Canonical  correlation  analysis  when  the 
data  are  curves.  J.  Roy.  Statist.  Soc.  Ser.  B,  55(3):725-740,  1993. 

[39]  Roland  Lindh,  Jeppe  Olsen,  and  Bjorn  O.  Roos.  Low-rank  configuration  interaction  with 
orbital  optimization — the  LR  SCF  approach.  Chemical  Physics  Letters,  148(4) :276-280,  July 
1988. 


an  Unconstrained  Sum  of  Slater  Determinants 


30 


[40]  Per-Olov  Lowdin.  Quantum  theory  of  many-particle  systems.  I.  Physical  interpretations  by 
means  of  density  matrices,  natural  spin-orbitals,  and  convergence  problems  in  the  method  of 
configuration  interaction.  Physical  Review,  97(6):1474-1489,  March  1955. 

[41]  Arne  Liichow  and  Reinhold  Fink.  On  the  systematic  improvement  of  fixed-node  diffusion 
quantum  Monte  Carlo  energies  using  natural  orbital  Cl  guide  functions.  J.  Chem.  Phys., 
113(19):8457-8463,  November  2000. 

[42]  Carl  D.  Meyer,  Jr.  Generalized  inversion  of  modified  matrices.  SIAM  J.  Appl.  Math., 
24:315-323,  1973. 

[43]  Martin  J.  Mohlenkamp  and  Lucas  Monzon.  Trigonometric  identities  and  sums  of  separable 
functions.  The  Mathematical  Intelligencer,  27(2):65-69,  2005. 

http :  //www.math.  ohiou.  edu/  ~mjm/re  search/ sine  .  pdf. 

[44]  Martin  J.  Mohlenkamp  and  Todd  Young.  Convergence  of  Green  iterations  for  Schrodinger 
equations.  In  Xiaoping  Shen,  editor,  Proceedings  of  International  Workshop  on 
Computational  Science  and  its  Education  2005,  (to  appear). 

[45]  Thomas  Muir.  A  Treatise  on  the  Theory  of  Determinants.  Privately  published,  Albany  New 
York,  1930.  revised  and  enlarged  by  William  H.  Metzler. 

[46]  Jozef  Noga,  Werner  Kutzelnigg,  and  Wim  Klopper.  CC-R12,  a  correlation  cusp  corrected 
coupled-cluster  method  with  a  pilot  application  to  the  Be2  potential  curve.  Chemical  Physics 
Letters,  199(5),  November  1992. 

[47]  Jeppe  Olsen,  Per-Ake  Malmquist,  Bjorn  O.  Roos,  Roland  Lindh,  and  Per-Olaf  Widmark.  A 
non-linear  approach  to  configuration  interaction.  The  low-rank  Cl  method  (LR  Cl). 

Chemical  Physics  Letters,  133(2):91-101,  January  1987. 

[48]  Ruben  Pauncz.  The  Symmetric  Group  in  Quantum  Chemistry.  CRC  Press,  Boca  Raton,  FL, 
1995. 

[49]  B.  Joakim  Persson  and  Peter  R.  Taylor.  Accurate  quantum-chemical  calculations:  The  use  of 
gaussian-type  geminal  functions  in  the  treatment  of  electron  correlation.  J.  Chem.  Phys., 
105(14):5915-5926,  October  1996. 

[50]  B.  Joakim  Persson  and  Peter  R.  Taylor.  Molecular  integrals  over  gaussian-type  geminal  basis 
functions.  Theor.  Chem.  Acc.,  97:240-250,  1997. 

[51]  V.V.  Prasolov.  Problems  and  Theorems  in  Linear  Algebra,  volume  134  of  Translations  of 
Mathematical  Monographs.  American  Mathematical  Society,  Providence,  R.I.,  1994. 

[52]  Michael  Reed  and  Barry  Simon.  Methods  of  Modern  Mathematical  Physics.  II.  Fourier 
analysis,  self-adjointness.  Academic  Press  [Harcourt  Brace  Jovanovich  Publishers],  New 
York,  1975. 

[53]  Michael  Reed  and  Barry  Simon.  Methods  of  Modem  Mathematical  Physics.  IV.  Analysis  of 
operators.  Academic  Press  [Harcourt  Brace  Jovanovich  Publishers],  New  York,  1978. 

[54]  Franz  Rellich.  Storungstheorie  der  Spektralzerlegung.  V.  Math.  Ann.,  118:462-484,  1942. 

[55]  Sven  Peter  Rudin.  Configuration  Interaction  with  Non- orthogonal  Slater  Determinants 
Applied  to  the  Hubbard  Model,  Atoms,  and  Small  Molecules.  PhD  thesis,  The  Ohio  State 
University,  1997. 

[56]  Jacek  Rychlewski,  Wojciech  Cencek,  and  Jacek  Komasa.  The  equivalence  of  explicitly 
correlated  slater  and  gaussian  functions  in  variational  quantum  chemistry  computations,  the 
ground  state  of  H2.  Chemical  Physics  Letters,  229:657-660,  November  1994. 

[57]  C.  David  Sherrill  and  Henry  F.  Schaefer  III.  The  configuration  interaction  method:  Advances 
in  highly  correlated  approaches.  Advances  in  Quantum  Chemisty,  127:143-269,  1999. 

[58]  Age  Smilde,  Rasmus  Bro,  and  Paul  Geladi.  Multi-way  Analysis.  Applications  in  the  Chemical 
Sciences.  John  Wiley  &  Sons,  2004. 

[59]  N.  Yarvin  and  V.  Rokhlin.  Generalized  Gaussian  quadratures  and  singular  value 
decompositions  of  integral  operators.  SIAM  J.  Sci.  Comput.,  20(2):699-718  (electronic),  1999. 

[60]  Jurgen  Zanghellini.  Multi- Electron  Dynamics  in  the  Ionization  of  Molecules  by  Strong  Laser 
Pulses.  PhD  thesis,  Vienna  University  of  Technology,  Vienna,  Austria,  2004. 


